Wow, this sucks so bad. A person posted on /r/datahoarder that they have created an archive of the Epstein files with added metadata like mentioned people, etc. But everything is LLM-generated... including the "full text" of the documents. Rather than OCRing them, they fed them to ChatGPT with a system prompt telling it that it was an expert at OCR.