You put what you DON’T want to see in the negative prompt. It gives you an additional way to control text-to-image generation. Many people treat it as an optional feature in Stable Diffusion v1.5. That changed with the release of Stable Diffusion v2, where the negative prompt became indispensable.
In this post, I will walk through a few use cases of negative prompts, including modifying content and style. Then, I will demonstrate the importance of negative prompts in v2 models. Finally, I will show how to search for a universal negative prompt.
This is the second part of a series on negative prompts. Read the first part: How does negative prompt work.
Enter negative prompt
Many Stable Diffusion GUIs and web services offer negative prompts. In AUTOMATIC1111 (installation instructions here), you enter the negative prompt right under where you put in the prompt.
Use cases
I will go through a few examples of using negative prompts so you can get some idea of what can be done and how to tweak it. In this section, I will use the v1.5 base model, but the techniques apply to v1, v2, or SDXL models.
Removing things
The first obvious usage is to remove anything you don’t want to see in the image. Let’s say you have generated a painting of Paris on a rainy day.
You want to generate another one but with an empty street. You can use the same seed value, which specifies the image, and add the negative prompt “people”. You get an image with most people removed.
Note that the scene is similar to, but not the same as, the original one. If you need the original scene, you will have to use inpainting to painstakingly remove the people while keeping the scene coherent.
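If you would rather script this experiment than click through the GUI, here is a minimal sketch. It assumes a local AUTOMATIC1111 instance started with the --api flag and listening on the default address. The prompt text and the seed are placeholders I made up for illustration, and build_payload and txt2img are hypothetical helper names. The point is only that the prompt and seed stay fixed while the negative prompt changes.

```python
import json
import urllib.request

# Assumption: a local AUTOMATIC1111 web UI started with the --api flag.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, negative_prompt="", seed=-1, steps=20):
    """Return a txt2img request body; seed=-1 asks for a random seed."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "seed": seed,
        "steps": steps,
    }


def txt2img(payload, url=API_URL):
    """POST the payload to the web UI and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values: the post does not state the exact prompt or seed.
PROMPT = "painting of Paris on a rainy day"
SEED = 12345

# Same prompt and same seed; only the negative prompt differs.
original = build_payload(PROMPT, seed=SEED)
no_people = build_payload(PROMPT, negative_prompt="people", seed=SEED)
```

Calling txt2img(original) and then txt2img(no_people) against a running instance should give you the two images to compare.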
You may have noticed that one person is still left in the image. You can tell Stable Diffusion to try harder by adding emphasis to the negative prompt: (people:1.3). That tells Stable Diffusion that the keyword “people” is 30% more important now.
Keep in mind that while you can use keyword emphasis in AUTOMATIC1111, it is not universally supported by all services. Be sure to check with the one you are using before writing me an angry email…
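To make the emphasis syntax concrete, here is a small sketch that reads (keyword:weight) entries out of a comma-separated negative prompt. It is a simplified illustration written for this post, not the parser AUTOMATIC1111 actually uses, and parse_emphasis and with_emphasis are hypothetical names. Keywords without explicit emphasis are assumed to have a weight of 1.0.

```python
import re

# Matches a single "(keyword:weight)" entry, e.g. "(people:1.3)".
EMPHASIS = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_emphasis(negative_prompt):
    """Split a comma-separated prompt into (keyword, weight) pairs.

    Entries without explicit emphasis get the assumed default weight 1.0.
    """
    pairs = []
    for part in negative_prompt.split(","):
        part = part.strip()
        if not part:
            continue
        match = EMPHASIS.fullmatch(part)
        if match:
            pairs.append((match.group(1).strip(), float(match.group(2))))
        else:
            pairs.append((part, 1.0))
    return pairs


def with_emphasis(keyword, weight):
    """Format a keyword with an emphasis factor."""
    return f"({keyword}:{weight})"


pairs = parse_emphasis("(people:1.3), blurry")
```

Here pairs holds one entry for “people” with weight 1.3 and one for “blurry” with the default weight.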
Modifying images
You can nudge Stable Diffusion to make subtle changes with negative prompts. Here, you don’t want to remove anything, only to make slight changes to the subject.
Let’s work on this base image:
It looks windy, and the hair is floating. Let’s use the negative prompt “windy” to keep the hair down.
Emma in the original image looked a bit… underdeveloped. Using the negative prompt “underage” makes her look more adult-like.
What if we are okay with the wind but want the hair to cover the ear? Let’s add a negative prompt, “ear,” with different emphasis factors. Below are three increasing emphases: 1.3, 1.6, and 1.9.
The ears are covered more by hair at all three emphasis factors, but when the factor reaches 1.9, the composition of the image changes. A negative prompt can strongly affect the diffusion process.
Negative prompt with keyword switching
Now, what if you really want to use a high emphasis such as (ear:1.9)? I don’t know your problem with ears, but I have a trick for you. You can use keyword switching to first use a meaningless word as a negative prompt and then switch to (ear:1.9) at a later sampling step.
Let’s pick “the” as the meaningless, dud negative prompt. You can verify its uselessness by putting it in the negative prompt. You will get the same image as if you didn’t put anything. Now use this as a negative prompt:
[the: (ear:1.9): 0.5]
Since I am using 20 sampling steps, this means using “the” as the negative prompt in steps 1-10 and (ear:1.9) in steps 11-20.
The reasoning is that the diffusion process is most important in the beginning steps. Later steps are only finer adjustments to details, such as hair covering the ears.
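The step arithmetic can be sketched in a few lines. This is a simplified model of the switching rule just described, not the web UI’s own code. The function names are hypothetical, and truncating fraction × steps to get the switch point is my assumption about how the fraction is turned into a step number.

```python
def switch_step(when, total_steps):
    """Number of sampling steps that use the first prompt in [first:second:when].

    Assumption: a value below 1 is a fraction of the total steps (truncated),
    and a value of 1 or more is an absolute step number.
    """
    if when < 1:
        return int(when * total_steps)
    return int(when)


def active_negative_prompt(step, total_steps, first, second, when):
    """Return the negative prompt in effect at a 1-based sampling step."""
    if step <= switch_step(when, total_steps):
        return first
    return second


# [the: (ear:1.9): 0.5] with 20 sampling steps
schedule = [
    active_negative_prompt(step, 20, "the", "(ear:1.9)", 0.5)
    for step in range(1, 21)
]
```

With these numbers, the first ten entries of schedule are the dud keyword and the last ten are (ear:1.9), matching the 1-10 and 11-20 split above.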
Now, what we have accomplished is nothing short of amazing.
- We can now use a much stronger emphasis (ear:1.9) without changing the composition.
- We get an image much closer to the original one.
- The ear is covered.
Modifying styles
Negative prompts are not only useful for modifying the content but also for modifying the style. Why use negative prompts to change style? Sometimes, adding too much to a positive prompt confuses the diffuser. Imagine someone telling you to go to 77 (the token limit) places at the same time. It helps if they tell you what areas to avoid instead.
Sharpening
Instead of using the keywords “sharp” and “focused” in the prompt, you can use “blurry” in the negative prompt. The image does get sharper.
Photorealistic
Using the negative prompt “painting, cartoon” makes the image more photo-like.
If you want to keep the original composition, you can experiment with the keyword switching I mentioned earlier. Using [the: (painting cartoon:1.9): 0.3], we get:
It’s much closer to the original but with an added photorealistic style.
Negative prompts are important for v2 models
Negative prompt with Stable Diffusion v2.1
Consistent with Max Woolf’s finding, my experience is that the negative prompt is very important for v2 models. Below, I used the positive prompt for generating realistic humans but with the Stable Diffusion 2.1 model.
a young female, highlights in hair, sitting outside restaurant, brown eyes, wearing a dress, side light
Adding just two to three negative prompts progressively improves the aesthetic of the images. I would say this is pretty near the quality of v1 models.
Negative prompt with Stable Diffusion v1.5
Let’s repeat the exercise on the v1.5 model.
The images come out pretty well without any negative prompts in v1.5. Adding the negative keywords ugly, deformed, and disfigured may improve things, but the effect is not as clear as in v2.1. It is as if the v1.5 model does not understand these words.
Why does a negative prompt become more important in v2?
This is an area I can only speculate on… but why not? The two changes in v2 are:
- It uses a larger OpenCLIP language model.
- NSFW content was filtered out of the training data.
The first suspect is the switch from OpenAI’s CLIP model to OpenCLIP. This affects the embeddings of the model. OpenAI trained the CLIP model with proprietary data. If that data is so highly curated that every person looks way above average, prompting “woman” would be the same as prompting “beautiful woman.” That would make prompting easier.
My second speculation is that NSFW images could also be highly aesthetic. It could be a failure of the filter, or it could be the nature of NSFW images. Either way, excluding NSFW images unintentionally biases the data towards the bad and ugly.
Boilerplate negative prompts in the v2 model
We have already touched on the importance of negative prompts in v2. Now, let’s find a good universal negative prompt.
Search for a good negative prompt
I will use the 2.1 model (512-pixel) for this test. The original images without a negative prompt are
Not bad, but it could be improved. Using our minimalist negative prompt, we immediately see improvements:
Adding “underexposed” and “overexposed” helps to make the images less flat.
It doesn’t hurt to add “low contrast”.
Next, let’s test this popular negative prompt for v2 floating around on the internet:
ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft
I think it’s doing a decent job, although it may have modified the style slightly. This could be caused by the negative keywords blurry, blurred, grainy, and draft. Some styles could look just like that. Deleting these keywords seems to bring the images back closer to the original style.
Next, add the lighting keywords we just used (low contrast, underexposed, overexposed). It does help the contrast and dynamic range.
Now, we arrive at the last negative prompt below by adding a few more negative keywords to avoid sampling bad art or newbie drawings. This is a pretty decent boilerplate negative prompt without affecting styles.
ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face
This is a huge improvement over the images without negative prompts. You may want to take out low contrast, underexposed, or overexposed if that is the style you are after.
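The search above boils down to a few list edits on the popular negative prompt. Here is a sketch of that bookkeeping, with helper names of my own choosing (split_keywords and without are hypothetical). It only manipulates the keyword strings from this post and does not generate any images.

```python
def split_keywords(prompt):
    """Split a comma-separated prompt into a list of trimmed keywords."""
    return [keyword.strip() for keyword in prompt.split(",") if keyword.strip()]


def without(prompt, dropped):
    """Return the prompt with the given keywords removed, order preserved."""
    return ", ".join(k for k in split_keywords(prompt) if k not in dropped)


# The popular v2 negative prompt tested above.
popular = (
    "ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, "
    "out of frame, extra limbs, disfigured, deformed, body out of frame, "
    "blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft"
)

# Keywords that seemed to alter the style.
style_sensitive = {"blurry", "blurred", "grainy", "draft"}
# Keywords added for lighting and against bad art.
lighting = ["low contrast", "underexposed", "overexposed"]
bad_art = ["bad art", "beginner", "amateur", "distorted face"]

kept = split_keywords(without(popular, style_sensitive))
boilerplate = ", ".join(kept + lighting + bad_art)

# Variant for styles that are meant to be flat, dark, or bright.
boilerplate_no_lighting = without(boilerplate, set(lighting))
```

The boilerplate string reproduces the final negative prompt above, and boilerplate_no_lighting is the variant to try when low contrast or unusual exposure is part of the look you want.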
Universal negative prompt
We will put the universal negative prompt for v2 we just found through a battery of tests to see how well it performs. As a recap, the universal negative prompt is:
ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, bad anatomy, watermark, signature, cut off, low contrast, underexposed, overexposed, bad art, beginner, amateur, distorted face
Photograph style
Prompt:
A man walking around her neighborhood, highlight hair, detailed eyes, sharp focus, young face, perfect symmetric face, pupil reflecting surroundings, realistic skin, soft healthy skin
The universal negative prompt works nicely with a photograph-style image. The guy looks a league above and seems to have spent more time on his hair in the morning…
Anime style
Prompt:
anime style girl on battleground, holding a ninja sword, detailed eyes, perfect face
The universal negative prompt works equally well for characters in anime style. The subject stands better, looks more handsome, and seems more ready to fight. The ninja sword is straightened and looks more dangerous.
Oil painting style
impressionist oil painting of a young man standing right next to a red tesla roadster by john sargent
The universal negative prompt helps both the Tesla and the guy. Instead of showing a flat-tired, beat-up car with a troubled teenager, it shows a shiny new car with a young man who looks like a million bucks.
Conclusion
Looks like this v2 universal negative prompt works well across a variety of styles! This concludes our two-part series on negative prompts.