this post was submitted on 22 Aug 2023
787 points (95.7% liked)

Technology


OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling's Harry Potter series

A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.

[–] dantheclamman 7 points 1 year ago (1 children)

Google's AI search preview seems to brazenly steal text from search results. Frequently its answers are word for word the same as one of the snippets lower on the page.

[–] SMITHandWESSON 3 points 1 year ago* (last edited 1 year ago) (1 children)

What the article is describing is CliffsNotes-style summaries or snippets of a story. Isn't that allowed in some respect? People post notes from school books all the time, and those notes show up in Google searches as well.

I totally don't know if I'm right, but doesn't copyright infringement involve plagiarism like copying the whole book or writing a similar story that has elements of someone else's work?

[–] dantheclamman 2 points 1 year ago (1 children)

I don't know what's considered fair use here. But the point is that it's taking words that aren't its own, which will deprive websites of traffic because people won't click through to the source article.

[–] SMITHandWESSON 1 points 1 year ago* (last edited 1 year ago)

OK, I get it now. I can definitely see both sides of the argument, and it's not going to be easy to solve.

Copyright law needs to be updated to deal with all the new ways people and companies are using tech to access copyrighted material.