Stable Diffusion

Discuss matters related to our favourite AI Art generation technology

founded 1 year ago

How do you guys write your prompt enhancers? (lemmy.dbzer0.com)

submitted 1 year ago by [email protected] to c/[email protected]

Do you keep it simple? Just long enough? Go wild with it? How about embeddings, do you also use them?

The more I learn about this, the less I understand it. Outside of some basic enhancers (masterpiece, best quality, worst quality, and bad anatomy/hands etc. if I'm generating a human), I don't see any big improvements. Every combination gives a different result; some look better, some look worse, depending on the seed, sampler, etc. It's basically a matter of taste. Note that I only do illustrations/paintings, so the differences might be small. Do you keep tweaking your prompts or just settle on the ones you've been using?

[–] DrakeRichards 5 points 1 year ago (2 children)

I don’t bother with prompt enhancers any more. Stable Diffusion isn’t MidJourney; quantity is far more important than quality. I just prompt for what I want and add negative prompts for things that show up that I don’t want. I’ll use textual inversions like badhandv4 if the details look really bad. If the model isn’t understanding at all then I’ll use ControlNet.

[–] lemmywink 5 points 1 year ago* (last edited 1 year ago) (1 children)

Agreed, although too much quantity seemed to water down results quite a bit. Too many tokens and I have to up the weights of nearly everything to 1.2-1.4; otherwise the aspects I want to show start to drop off.

Anecdotally, I found the best length to be about 75 positive tokens, though I'd recommend never going over the 150 token limit if you can help it.

I have a canned negative prompt list that is super long though, easily 200 tokens. Just a hodgepodge of some of the things you listed: bad_anatomy, missing_limbs and missing_hands, for example, are crucial to have. Adding ugly with a weight over 0.8 has strange results too, I've found. Hope that helps!
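
For anyone who wants to script the weighting, here is a minimal sketch. It assumes the "(term:weight)" emphasis syntax used by AUTOMATIC1111's web UI and that the UI processes prompts in chunks of 75 tokens; other front ends may differ. The helper names and example terms are made up for illustration, and token counts must come from your UI or tokenizer, since this code does not tokenize anything.

```python
import math


def weighted(term: str, weight: float = 1.0) -> str:
    """Wrap a term in the assumed "(term:weight)" emphasis syntax."""
    if weight == 1.0:
        return term  # default weight needs no wrapper
    return f"({term}:{weight:g})"


def build_prompt(terms: dict[str, float]) -> str:
    """Join terms into one comma-separated prompt, keeping insertion order."""
    return ", ".join(weighted(term, weight) for term, weight in terms.items())


def chunks_needed(token_count: int, chunk_size: int = 75) -> int:
    """How many chunks a prompt of token_count tokens would span (assumed chunk size)."""
    return math.ceil(token_count / chunk_size)


# Hypothetical example terms and weights.
prompt = build_prompt({"a house on a lake": 1.0, "oil painting": 1.2, "sunset": 1.4})
print(prompt)
print(chunks_needed(200))
```

Keeping the weights in one dict makes it easy to see when nearly everything has crept up to 1.2-1.4, which is usually the sign that the prompt has grown too long.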

[–] DrakeRichards 5 points 1 year ago

I meant the quantity of generated images, not the number of tokens. I rarely go over 50 tokens now. As you said, too many tokens and things start to interact in really odd ways. That’s why I’m not a fan of massive lists of negative tokens either; they are much more efficient as a textual inversion like badhandv4 or Easynegative.

However, I only use txt2img to get the rough composition of an image; most of my work is done in inpainting afterwards. If you’re looking to have good images just from txt2img then sometimes lots of tokens are necessary.

Just like traditional art though, this is all based on individual style. It’s important to use what works best for you.

[–] [email protected] 1 points 1 year ago (1 children)

Do you have any good beginner's guide for ControlNet?

[–] [email protected] 5 points 1 year ago (1 children)

Not them but I found this one helpful: https://stable-diffusion-art.com/controlnet/

[–] [email protected] 2 points 1 year ago

Thanks!

[–] [email protected] 5 points 1 year ago (1 children)

When I started I was just copying from online galleries like Civitai or Leonardo.ai, which gave me noticeably better images than what I had come up with myself before. However, it seemed to me that many of these images may also just have copied prompts without understanding what's really going on with them and I started to experiment for myself.

What I do right now is build my images "from the ground up", starting with super basic prompts like "a house on a lake" and working from there. First I add descriptions to get the image composition right, then I work in the style I'm looking for (photography, digital artwork, cartoon, 3D render, ...). Then I work in enhancers and see what they change. I found that one has to be patient, only change one thing at a time, and always do a couple of images (at least a batch of 8) to see if and what the changes are.

So I still comb through image galleries for inspiration in prompting, but most of the time I now just pick one keyword or enhancer and see what it does to my own images.

It is a long process that requires many iterations, but I find it really enjoyable.
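
The one-change-at-a-time routine above can be sketched as a small experiment plan. This is only an illustration: `Run`, `plan_runs` and the starting seed are hypothetical names and values, and reusing the same seeds for every variant is an added assumption (not something stated above) so that differences come from the prompt rather than the noise. The code only builds the list of prompt/seed pairs; feeding them to a generator is left to whatever tool you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    prompt: str
    seed: int


def plan_runs(base: str, candidates: list[str],
              batch_size: int = 8, first_seed: int = 1000) -> dict[str, list[Run]]:
    """Build one batch for the base prompt and one per single added keyword."""
    seeds = range(first_seed, first_seed + batch_size)  # same seeds for every variant
    plan = {"(base)": [Run(base, seed) for seed in seeds]}
    for extra in candidates:
        # Exactly one change per variant: the base prompt plus one keyword.
        plan[extra] = [Run(f"{base}, {extra}", seed) for seed in seeds]
    return plan


plan = plan_runs("a house on a lake", ["digital artwork", "masterpiece"])
for label, runs in plan.items():
    print(label, len(runs), runs[0].prompt)
```

Comparing each variant batch against the base batch seed by seed makes it easier to tell whether a keyword actually changes anything or just reshuffles the result.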

[–] [email protected] 1 points 1 year ago

"However, it seemed to me that many of these images may also just have copied prompts without understanding what’s really going on with them and I started to experiment for myself."

I thought the same. Some of the prompts are either conflicting or just unrelated. I've been playing around with fewer prompts too. Not an easy task, since most of the time the changes aren't good or bad, just a different style and taste.

[–] Scew 4 points 1 year ago* (last edited 1 year ago) (1 children)

Prompting is basically the ability to be descriptive. Using 'enhancers' feels a lot like trying to replace an experienced author's descriptions with hashtags. Which isn't necessarily a bad thing, because it's been trained with tags that are basically the equivalent of a hashtag... but mixing them together on your own to find out what effects they can create seems to be way more rewarding than just copying and pasting other people's 'enhancers.'

As far as tweaking the prompt, it's a setting just like everything else. If you get one to spit out something you like, stick with it for a while and test out tweaking other settings.

[–] [email protected] 3 points 1 year ago (1 children)

I've been minimizing the prompts I use ever since I felt the prompts other people use are too excessive. It's not easy trying out prompts though, as sometimes the result changes drastically and sometimes it's barely noticeable, and not in a good or bad way, more like a different "taste".

[–] Scew 1 points 1 year ago

Yep, I'm also a fan of short prompts. There's something satisfying about being able to achieve desired flavors/tastes with as few words as possible. The only times I seem to need to feed it more are if I'm running controlnets and don't want it to diverge too much from the original image.