this post was submitted on 06 Sep 2024
Technology
The problem isn't that it does it regularly, but that it can do it at all, which means the copyrighted works are reproducible no matter how much the interface tries to hide that. So the model isn't really "learning" the way a human would in any capacity (that should be obvious); it's storing data in a way that would violate fair use, and it could generate copyright-violating portions of works.
Humans read and don't retain the originals. The argument is that LLMs retain the originals, and that's where the issue lies.