this post was submitted on 06 Sep 2024
Technology
What about companies that scrape public sites for training data but then publish their trained models as open source for anyone to use?
That feels a lot more reasonable and fair to me personally.
If they still profit from it, no.
Open models made by nonprofit organisations that list their sources, exclude content from anyone who asks to be left out (with robots.txt, for instance), and carry a GPL-like viral license that prevents the models and their output from being used for profit... that'd probably be fine.
And also be useless for most practical applications.
We're talking about LLMs. They're useless for most practical applications by definition.
And when they're not entirely useless (basically, autocomplete), they're orders of magnitude less cost-effective than older, almost equivalent alternatives, so they're effectively useless at that, too.
They're fancy, extremely costly toys without any practical use that, thanks to the short-sighted greed of the scammers selling them, will soon become even more useless due to model collapse.