BURN

joined 2 years ago
[–] BURN -2 points 1 year ago* (last edited 1 year ago) (5 children)

Copying copyright-protected data is theft AND stealing

Edit: this also applies to my stance on piracy, which I don’t engage in for the same reason. It’s theft.

[–] BURN 6 points 1 year ago (1 children)

Ok and?

That doesn’t mean it’s any less theft, or that you have any idea what you’re talking about.

https://www.rws.com/blog/large-language-models-humans/

https://www.lesswrong.com/posts/rjghymycfrMY2aRk5/llm-cognition-is-probably-not-human-like

There are also countless papers on Google Scholar that point out the differences.

[–] BURN 8 points 1 year ago (4 children)

I could say the same about you, considering I’ve watched you peddle false information for months about this subject.

AI learns differently than humans. That isn’t a fact up for debate. That’s one of the few objective truths around this industry.

[–] BURN 2 points 1 year ago

The entire airline industry runs on antiquated tech.

Between new certifications being needed for everything, and an attitude of “if it ain’t broke, don’t fix it”, combined with the constant attempts to save money, airplanes are rarely updated.

[–] BURN 6 points 1 year ago (6 children)

Backed by technical facts.

AIs fundamentally process information differently than humans. That’s not up for debate.

[–] BURN 1 points 1 year ago (8 children)

That does nothing to solve the problem of data being used without consent to train the models. It doesn’t matter if the model is FOSS if it stole all the data it trained on.

[–] BURN 5 points 1 year ago (9 children)

AI does not “read books” and it’s completely disingenuous to compare it to humans that way.

[–] BURN 0 points 1 year ago (3 children)

It is stealing data. In order to train on it they have to store the data. That’s a copyright violation. There’s no way to interpret it as not stealing data.

[–] BURN 1 points 1 year ago (5 children)

No

Why are you entitled to use everyone else’s work? It should be secured in law that licensing applies to training data to avoid frivolous discussions like this. Then it’s an entirely opt-in solution, which works to the benefit of everyone except the people stealing data.

Output doesn’t matter since it’s pretty well settled it’s not derivative work (as much as I disagree with that statement).

[–] BURN 3 points 1 year ago (7 children)

You don’t need to prove a financial difference. They are fundamentally different systems that function in different ways. They cannot be compared 1:1, and laws cannot be applied 1:1. New regulations need to be added around AI use of copyrighted material.

[–] BURN 3 points 1 year ago

Yes you would need permission. Just because you’re a hobbyist doesn’t mean you’re exempt from needing to follow the rules.

As soon as it goes beyond a completely offline, personal, non-replicable project, it should be subject to the same copyright laws.

If you purely create a data-agnostic AI model and share the code, there’s no problem, as you’re not profiting off of the training data. If you create an AI model that’s available for others to use, then you’d need to have the licensing rights to all of the training data.

[–] BURN 8 points 1 year ago (4 children)

Training is theft imo. You have to scrape and store the training data, which amounts to copyright violation based on replication. It’s an incredibly simple concept. The model isn’t the problem here, the training data is.
