this post was submitted on 09 Jan 2024
530 points (98.2% liked)

‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says::Pressure grows on artificial intelligence firms over the content used to train their products

[–] BURN 5 points 11 months ago (10 children)

Too bad

Why do they have free rein to store and use copyrighted material as training data? AIs don’t learn as a human would, and comparisons can’t be made between the learning processes.

[–] SCB -1 points 11 months ago* (last edited 11 months ago) (1 children)

Why do you have free rein to do the same?

AIs don’t learn as a human would, and comparisons can’t be made between the learning processes.

I think you're going to have a hard time proving a financial distinction between them

[–] BURN 3 points 11 months ago (1 children)

You don’t need to prove a financial difference. They are fundamentally different systems that function in different ways. They cannot be compared 1:1 and laws cannot be applied as a 1:1. New regulations need to be added around AI use of copyrighted material.

[–] SCB 0 points 11 months ago (1 children)

I agree. For instance, it should be secured in law that you can train AI on anything, to avoid frivolous discussions like this.

Output is what should be moderated by law.

[–] BURN 1 points 11 months ago (1 children)

No

Why are you entitled to use everyone else’s work? It should be secured in law that licensing applies to training data to avoid frivolous discussions like this. Then it’s an entirely opt-in solution, which works to the benefit of everyone except the people stealing data.

Output doesn’t matter since it’s pretty well settled it’s not derivative work (as much as I disagree with that statement).

[–] SCB 2 points 11 months ago (1 children)

the people stealing data

No one is doing this

Output doesn’t matter since it’s pretty well settled it’s not derivative work

Cool, discussion over.

[–] BURN 0 points 11 months ago (1 children)

It is stealing data. In order to train on it they have to store the data. That’s a copyright violation. There’s no way to interpret it as not stealing data.

[–] 5too 0 points 11 months ago (1 children)

It is not stealing. The data is still there. It is, at worst, a copyright violation.

[–] BURN 2 points 11 months ago (1 children)

Copyright violation is stealing

[–] ultranaut 0 points 11 months ago

Stealing means someone has been deprived of their property, which is not the case for copyright violations.
