this post was submitted on 04 Sep 2024
ChatGPT
Unofficial ChatGPT community to discuss anything ChatGPT
AI trainers curate the data they use for training. We've gone past the phase where people just dump Common Crawl onto a neural net and tell it "figure that out somehow!" That worked back when we had no idea what we were doing or what would produce passable results; nowadays we know what produces better results. "Model collapse" has been known as a potential problem for years. The studies demonstrating it use unrealistic training methodologies to force it to extremes; real training works to avoid it.
And finally, that "57% of content is AI-generated!" headline that's been breathlessly spamming all the feeds? Grossly misleading, of course. The actual study found that 57% of the content in their sample that had been translated into other languages had been translated into three or more languages, which they interpreted as meaning it had been AI-translated.
People are so eager to click on "AI sucks and is dying!" headlines.