this post was submitted on 21 Aug 2024
This is the logical endpoint for all the people who were complaining that scraping the open web for training is somehow immoral or illegal. Instead of stopping AI, those with deep pockets will continue to train on everything, while open source and small company efforts will be locked out.
Useful AI will be focused and narrow unless they actually achieve AGI.
Scraping literally the whole internet for inspiration is part of the reason they come up with utter rubbish. No one's actually scrutinizing what they're ingesting. The problem isn't so much that they violate copyright; it's that, because they scrape this way, their output is garbage.
If these AI companies actually did some content curation, we might get decent AI out of it.