this post was submitted on 28 Oct 2024
Technology

[–] [email protected] 0 points 3 weeks ago (3 children)

I see, so your argument is that because the training data is not stored in the model in its original form, it doesn't count as a copy, and therefore it doesn't constitute intellectual property theft. I had never really understood what the justification for this point of view was, so thanks for that, it's a bit clearer now. It's still wrong, but at least it makes some kind of sense.

If the model "has no memory of training data images", then what effect is it that the images have on the model? Why is the training data necessary, what is its function?

[–] [email protected] 4 points 3 weeks ago* (last edited 3 weeks ago)

"the training data is not stored in the model in its original form,"

It is not stored in the model, period. In the same way, you do not store the shapes of the letters you're reading right now, nor even the words, but their overall meaning. Remembering the meaning of what I write here, you can then produce words and letters again, and you might get close, but even with this short paragraph you'll find it very hard to make an exact replica. That's because you did not store it in its original form, not even compressed; you re-encoded it using your own understanding of language, of the world, of everything.

[–] [email protected] 2 points 3 weeks ago

Here's a video explaining how diffusion models work, and this article by Kit Walsh, a senior staff attorney at the EFF.

[–] [email protected] 2 points 3 weeks ago

I agree with what @[email protected] said here. My argument is the same as the one you've already heard: since the model doesn't take the original images but rather learns from them, it acts like a human who also learns from many different images, and it would make no sense to extend copyright to every artist a human has learned from. It's also true that a human artist has personal experience that influences their art, while the neural network only has the art; however, the AI artist provides that personal experience. So IMO you shouldn't consider image generation plagiarism.

Though I do agree that having people scrape your art to train a model on it is frustrating, even though people were already learning from your art for their own personal experience. In the case of a model, the output can be far more similar to the original art pieces. I haven't made up my mind on the ethics of model training, but generating is not plagiarism in my opinion.

Anyway, my original stance was about generative AI being used as art, not about whether it is plagiarism. Generative AI brings a way to make full pictures with minimal effort, and some people generate hundreds of unoriginal, similar images. IMO, since it is easy to get a final image, the artistic effort lies elsewhere: the composition, the originality of the subjects, mixing newer techniques (regional prompting, LoRA, ControlNet, etc.), and mixing with other tools (Photoshop, Blender, animation, etc.). You definitely can make art with generative AI, and it takes more time than it looks. (Look up a video on ComfyUI, SD.Next or InvokeAI to see examples of workflows.)