this post was submitted on 17 May 2024
[–] [email protected] 4 points 5 months ago* (last edited 5 months ago)

On the other hand, there are things that a human artist is utterly awful at and that diffusion-based generative AIs are amazing at. I mentioned that these models are great at producing works in a given style and can switch styles virtually effortlessly. I'm gonna do a couple of Spider-Man renditions in different styles; each takes about ten seconds on my system:

Spider-Man as done by Neal Adams:

Spider-Man as done by Alex Toth:

Spider-Man in a noir style done by Darwyn Cooke:

Spider-Man as done by Roy Lichtenstein:

Spider-Man as painted by early-19th-century English landscape artist J. M. W. Turner:

And yes, I know, fingers, but I'm not generating a huge batch to try to get an ideal image, just doing a quick run to illustrate the point.

Note that none of the above were actually Spider-Man artists, other than Adams, and that only briefly.
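For anyone who wants to try the same thing, the runs above boil down to holding the subject fixed and swapping the style clause in the prompt. Here's a minimal sketch of that idea. The exact prompts and settings I used aren't listed here, so the strings below just mirror the captions, and `build_prompts` is a made-up helper name, not part of any library:

```python
# Hypothetical sketch: the real prompts and generation settings are not
# given in this comment, so these strings simply mirror the captions.
SUBJECT = "Spider-Man"

STYLES = [
    "as done by Neal Adams",
    "as done by Alex Toth",
    "in a noir style done by Darwyn Cooke",
    "as done by Roy Lichtenstein",
    "as painted by J. M. W. Turner",
]


def build_prompts(subject, styles):
    """Return one prompt per style, keeping the subject fixed."""
    return [f"{subject} {style}" for style in styles]


prompts = build_prompts(SUBJECT, STYLES)
for prompt in prompts:
    # Each prompt would be handed to whatever text-to-image model you run;
    # here we only print it.
    print(prompt)
```

Only the style clause changes between runs, which is why switching styles costs the model essentially nothing.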

Switching styles like that is really hard for a human, because for a human, style is a function of the workflow and the whole collection of techniques used to arrive at the final image. Stable Diffusion doesn't care about techniques or how the image got the way it is -- it only learns from the output of those workflows in its training corpus. So for Stable Diffusion, creating an image in a variety of styles or mediums -- even ones that are normally very time-consuming to work in -- is easy as pie, whereas for a single human artist, it'd be very difficult.

I think that particular aspect is what gets a lot of artists concerned. Because it's (relatively) difficult for humans to replicate artistic styles, artists have treated their "style" as something of a stock-in-trade: they can sell someone a work in their particular style, resulting from the particular workflow and techniques they've developed. Something for which switching styles poses little to no barrier, like diffusion-based generative AIs, upends that business model.

Both of those are things that a human viewer might want. I might want to say "take that image, but do it in watercolor", "make that image look more like style X", or "blend those two styles". Image generators are great at that. But I might equally want to say "show this scene from another angle with the characters doing something else", and that's something that human artists are great at.