npub1uhn7...rerk 3 months ago

What ingests large quantities of data, applies some math, and outputs a much smaller set of data that can still be used to reconstruct what was ingested? Compression algorithms, right? And some of them compress *so much* that the original can only be recovered partially and the rest has to be "guessed", like JPEGs. LLMs ingest terabyte after terabyte and compress that into a few miserable gigabytes. That's why the info they produce is like a JPEG with a compression ratio of 1000:1 or more.
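
To put rough numbers on the analogy, here is a minimal Python sketch. The sizes (5 TB in, 5 GB out) are made up for illustration and are not measurements of any real model. The `lossy_compress` / `reconstruct` helpers are hypothetical names for a toy quantizer, standing in for "throw away detail, guess it back later"; this is not how JPEG or an LLM actually works.

```python
TB = 10**12  # bytes in a terabyte (decimal)
GB = 10**9   # bytes in a gigabyte (decimal)


def compression_ratio(original_bytes, compressed_bytes):
    """How many input bytes each stored byte has to stand in for."""
    return original_bytes / compressed_bytes


def lossy_compress(samples, step):
    """Toy lossy step: keep only the index of the nearest coarse bucket."""
    return [round(s / step) for s in samples]


def reconstruct(buckets, step):
    """'Guess' the original values back from the coarse bucket indices."""
    return [b * step for b in buckets]


# Assumed sizes, purely illustrative.
ratio = compression_ratio(5 * TB, 5 * GB)  # 1000.0

samples = [0.12, 0.47, 0.51, 0.98]
restored = reconstruct(lossy_compress(samples, step=0.5), step=0.5)
# restored == [0.0, 0.5, 0.5, 1.0]: close to the input, but the detail is gone
```

The restored values look plausible, but they are no longer the originals, and the coarser the `step`, the more of the result is guesswork.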