this post was submitted on 24 Aug 2023

Stephen King: My Books Were Used to Train AI

One prominent author responds to the revelation that his writing is being used to coach artificial intelligence.

[–] BetaDoggo_ 2 points 1 year ago

Obviously, restricting the input will cause the model to overfit, but that's not an issue for most models, where billions of samples are used. In the case of Stable Diffusion, this paper had a ~0.03% success rate extracting training data after 500 attempts on each image, or ~6.23E-5% per generation. And that was on a targeted set with the highest number of duplicates in the dataset.
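
For anyone who wants to check the per-generation figure, here is a rough sketch of the arithmetic. It assumes the per-image success rate is simply spread evenly over the 500 attempts made on each image; the function name is made up for illustration. Using the rounded 0.03% gives about 6E-5%, so the 6.23E-5% quoted above presumably comes from the unrounded rate.

```python
def per_generation_rate(per_image_rate_pct: float, attempts_per_image: int) -> float:
    """Spread a per-image success rate (in percent) evenly over the attempts per image.

    Assumption: each extracted image counts as one success out of all of its attempts.
    """
    return per_image_rate_pct / attempts_per_image


# Rounded inputs taken from the comment: ~0.03% per image, 500 attempts per image.
rate = per_generation_rate(0.03, 500)
print(f"{rate:.2e}% per generation")
```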

The reason they were sold doesn't matter; as long as the material isn't being redistributed, copyright isn't being violated.