I have a local installation of Vladmandic's fork of the Automatic1111 web UI, so these steps are specific to Stable Diffusion. They will also work fine if you have a good Colab setup.
1. Find a good base image
- Pick the right model. I use A-Zovya RPG Artist Tools mostly, but there are many other models that are great for more specific styles.
- Start with a simple prompt that includes general details about your subject. Don't go down the Midjourney-style rabbit hole of crafting the perfect prompt; with current models, the quantity of images you generate matters far more than prompt quality.
- Use txt2img to generate an image that will serve as a good base. You're looking for something that has the right colors, silhouette, and style. Don't worry about the fine details like fingers and faces: you'll clean those up later.
- Generate just a few images with your initial prompt to see what sort of results you get: batches of 5-10 should be enough to tell you if you're using the right tokens. If you see something frequently popping up that you don't want, add it to the negative prompt. Change your prompts around until your results start consistently including the details you're looking for.
- If you find an image that you like the style of but not the details, you can use that image as an input for ControlNet's Reference model in txt2img.
Here's the image I chose as my base for the tiefling. Generation parameters:
Female tiefling, sorcerer, librarian, goat horns, watercolor, masterpiece, best quality
Negative prompt: bad_prompt_version2, nude, nsfw, explicit, penis, nipples, sex, suggestive, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Steps: 10, Sampler: UniPC, CFG scale: 7, Seed: 3178845724, Size: 512x512, Model hash: da5224a242, Model: aZovyaRPGArtistTools_v2, Version: 1dffd11
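If you'd rather script the batch generation than click through the UI, the web UI exposes an HTTP API when launched with the `--api` flag (the txt2img endpoint is `POST /sdapi/v1/txt2img`). Here's a minimal sketch that rebuilds the generation parameters above as an API payload; the exact field names are per the Automatic1111 API and may differ slightly in the Vladmandic fork, so treat this as a starting point:

```python
import json

def build_txt2img_payload():
    """Payload matching the generation parameters above.

    POST this as JSON to http://127.0.0.1:7860/sdapi/v1/txt2img
    (requires the web UI to be running with --api).
    """
    return {
        "prompt": ("Female tiefling, sorcerer, librarian, goat horns, "
                   "watercolor, masterpiece, best quality"),
        # Abbreviated here; use the full negative prompt listed above.
        "negative_prompt": "bad_prompt_version2, nude, nsfw, lowres, bad anatomy",
        "steps": 10,
        "sampler_name": "UniPC",
        "cfg_scale": 7,
        "seed": 3178845724,
        "width": 512,
        "height": 512,
        "batch_size": 8,  # batches of 5-10, as suggested above
    }

payload = build_txt2img_payload()
print(json.dumps(payload, indent=2))
```

The response contains a base64-encoded image list under the `"images"` key, so you can dump a whole batch to disk and flip through it for a good base image.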
2. Inpaint
- Inpaint any details you don't like. I use the openOutpaint extension since it makes it much easier to pick out details.
- Make sure you use an inpainting model, and don't forget to load a VAE! Some checkpoints have a VAE baked into their base model but not their inpainting model, which will make your inpainting attempts look grey and generally awful. Which VAE you choose honestly doesn't matter much: I use Grapefruit's since it produces nice vibrant colors, but if I'm getting faces that look too much like anime, I'll switch to the base 1.5 VAE.
- Keep inpainting until you have all the features you want (right number of fingers, right clothes, etc.). Don't worry about the really fine details like eyes and fingernails yet. This will be the longest step; take your time!
- openOutpaint also lets you outpaint, so do that now if you want.
- Once you've got everything right, upscale it 2-4x. Try out all the different upscaling models to see which one works best for this specific image.
- Start inpainting again. Your image is now much larger, but you should still inpaint in 512x512 sections, so you'll need to tailor your prompt to whatever you're inpainting at the time.
- Just keep inpainting and upscaling as needed. There's no one way to do things from here; just tinker until you're happy with the results.
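The upscale step can also be scripted through the same API, via the extras endpoint (`POST /sdapi/v1/extra-single-image`). A minimal sketch of building that request; the upscaler name here ("R-ESRGAN 4x+") is just one example from the UI's dropdown, and as noted above you should try several models per image:

```python
import base64

def build_upscale_payload(image_bytes, scale=2, upscaler="R-ESRGAN 4x+"):
    """Payload for the extras upscale endpoint.

    POST this as JSON to http://127.0.0.1:7860/sdapi/v1/extra-single-image
    (requires the web UI to be running with --api). `scale` is the
    resize multiplier (2-4x, per the workflow above).
    """
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "upscaling_resize": scale,
        "upscaler_1": upscaler,
    }

# Example with placeholder bytes; in practice, read your base image from disk.
payload = build_upscale_payload(b"placeholder-png-bytes", scale=2)
```

Swapping out `upscaler_1` in a loop is an easy way to generate one result per upscaling model and compare them side by side.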
Here are most of the images that I generated for the tiefling. This took me about 12 hours and almost 1000 images.