this post was submitted on 26 Apr 2024
Stable Diffusion


This stuff moves so fast I really can't keep up and a lot of the research posted here goes a bit over my head. I'm looking for something that doesn't seem too out of the question given things like CLIPSeg. Is there some tool or library out there that will accept an image and a prompt and then generate a mask within the image that generally corresponds to the prompt?

For example, if I had a picture of an empty park and gave the prompt "little girl flying a kite", I should get back a mask vaguely in the shape of a child, with a sort of blob mask in the sky for the kite. From there I could use the mask to inpaint those things. I would really like to be able to layer an image kind of like Photoshop, so it's not all-or-nothing and I can focus on one element at a time. I could do the masking manually, but of course we all want fewer steps in our workflows.

[email protected] 1 point 7 months ago

I think there's a ControlNet preprocessor that does this.

BrianTheeBiscuiteer 1 point 7 months ago

Some of the segmentation tools come close, but they only separate objects that are already in the image; they don't add new ones. The best workflow I can think of with minimal editing is to run img2img (I2I) to add the object, create a mask for the new object from that result, then apply that mask to the original image.
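
One rough way to do the "create a mask for the new object" step, assuming the img2img result has the same size as the original: treat every pixel that changed noticeably as part of the new object. This is only a sketch, not an existing tool. The function name `difference_mask` and the `threshold` and `grow` parameters are made up for illustration, and img2img tends to alter the whole picture a little, so the threshold will likely need tuning per image.

```python
import numpy as np

def difference_mask(original, edited, threshold=30, grow=2):
    """Boolean mask of pixels that differ between two same-size RGB images.

    original, edited: arrays of shape (H, W, 3) with values 0-255.
    threshold: minimum per-channel difference that counts as "changed".
    grow: number of one-pixel dilation passes to pad the mask edges.
    """
    a = np.asarray(original, dtype=np.int16)
    b = np.asarray(edited, dtype=np.int16)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")

    # Largest absolute channel difference at each pixel.
    diff = np.abs(a - b).max(axis=-1)
    mask = diff > threshold

    # Crude 4-neighbour dilation: OR the mask with copies of itself
    # shifted by one pixel, so the inpaint region gets some margin.
    for _ in range(grow):
        padded = np.pad(mask, 1, mode="constant", constant_values=False)
        mask = (
            padded[1:-1, 1:-1]
            | padded[:-2, 1:-1] | padded[2:, 1:-1]
            | padded[1:-1, :-2] | padded[1:-1, 2:]
        )
    return mask
```

The result can be turned into a black-and-white image (for example `mask.astype(np.uint8) * 255`) and used as the inpainting mask on the original picture. Most inpainting front ends I'm aware of treat white as the area to repaint, but check that for whichever tool you use.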