Modern Hell #29: Echoes of Ourselves
We spent years putting everything online. Now that data is being used to create a future nobody wants.
Last week, the Washington Post published an analysis of the data sets used to train Large Language Models (LLMs). The Post was unable to examine the data used to train ChatGPT because OpenAI hasn’t disclosed its sources, but it did analyze Google’s C4 data set, “a massive snapshot of the contents of 15 million websites that have been used to in…