Comprehensive Guide to Stable Diffusion Samplers

Many sampling methods are available in AUTOMATIC1111. Euler a, Heun, DDIM… What are samplers? How do they work? What is the difference between them? Which one should you use? You will find the answers in this article.

We will discuss the samplers available in the AUTOMATIC1111 Stable Diffusion GUI. You can use this GUI on Windows, Mac, or Google Colab.

What is Sampling?

The sampler is responsible for carrying out the denoising steps.

To produce an image, Stable Diffusion first generates a completely random image in the latent space. The noise predictor then estimates the noise of the image. The predicted noise is subtracted from the image. This process is repeated a dozen times. In the end, you get a clean image.

This denoising process is called sampling because Stable Diffusion generates a new sample image in each step. The method used in sampling is called the sampler or sampling method.

Sampling is just one part of the Stable Diffusion model. Read the article “How does Stable Diffusion work?” if you want to understand the whole model.

Below is a sampling process in action. The sampler gradually produces cleaner and cleaner images.

Images after each denoising step.

While the framework is the same, there are many different ways to carry out this denoising process. It is often a trade-off between speed and accuracy.

Noise schedule

You must have noticed the noisy image gradually turns into a clear one. The noise schedule controls the noise level at each sampling step. The noise is highest at the first step and gradually reduces to zero at the last step.

At each step, the sampler’s job is to produce an image with a noise level matching the noise schedule.

Noise schedule for 15 sampling steps.

What’s the effect of increasing the number of sampling steps? A smaller noise reduction between each step. This helps to reduce the truncation error of the sampling.

Compare the noise schedules of 15 steps and 30 steps below.

Noise schedule for 30 sampling steps.
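To put numbers on this, here is a minimal sketch. The schedule shape (a straight ramp from an assumed maximum noise level of 10 down to zero) and the function names are illustrative assumptions, not the schedule AUTOMATIC1111 actually uses. Only the trend matters: more steps means less noise removed per step.

```python
def linear_noise_schedule(num_steps, sigma_max=10.0):
    """Noise level at the start of each step, followed by a final 0.0 (illustrative ramp)."""
    return [sigma_max * (1 - i / num_steps) for i in range(num_steps + 1)]


def largest_noise_drop(sigmas):
    """Largest noise reduction between two consecutive steps."""
    return max(a - b for a, b in zip(sigmas, sigmas[1:]))


schedule_15 = linear_noise_schedule(15)
schedule_30 = linear_noise_schedule(30)

# With this ramp, doubling the number of steps roughly halves the noise removed per step.
print(largest_noise_drop(schedule_15))
print(largest_noise_drop(schedule_30))
```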

Diffusion trajectory

What does diffusion look like in latent space? Well, an SDXL sample diffuses in a 65,536-dimensional latent space (!), so it is rather difficult to imagine.

I have projected the diffusion process in a two-dimensional space using PCA. The two principal components are shown below.

Consistent with the noise schedule, the diffusion takes larger steps initially and smaller steps as it approaches the end. The initial steps set the global composition of the image. The later steps refine the details.

The sample must navigate a complex landscape of probability distribution of training images to get to the final latent image.

Changing seeds

Changing the seed of the image generates a similar but different image. What actually happens in sampling? The samples take slightly different trajectories but end at a similar place.

Changing prompt

Changing the prompt will drastically change the end point of the sample because you end up getting a very different image.

Samplers overview

At the time of writing, there are 19 samplers available in AUTOMATIC1111. The number seems to be growing over time. What are the differences?

Samplers in AUTOMATIC1111.

You will learn what they are in the later part of this article. The technical details can be overwhelming, so I include a bird’s-eye view in this section. This should help you get a general idea of what they are.

Old-School ODE solvers

Let’s knock out the easy ones first. Some of the samplers on the list were invented more than a hundred years ago. They are old-school solvers for ordinary differential equations (ODE).

  • Euler – The simplest possible solver.
  • Heun – A more accurate but slower version of Euler.
  • LMS (Linear multi-step method) – Same speed as Euler but (supposedly) more accurate.

Ancestral samplers

Do you notice that some samplers’ names have a single letter “a”?

  • Euler a
  • DPM2 a
  • DPM++ 2S a
  • DPM++ 2S a Karras

They are ancestral samplers. An ancestral sampler adds noise to the image at each sampling step. They are stochastic samplers because the sampling outcome has some randomness to it.

Be aware that many others are also stochastic samplers, even though their names do not have an “a” in them.

The drawback of using an ancestral sampler is that the image would not converge. Compare the images generated using Euler a and Euler below.

Euler a does not converge. (sampling steps 2-40)
Euler converges. (sampling steps 2-40)

Images generated with Euler a do not converge at high sampling steps. In contrast, images from Euler converge well.

For reproducibility, it is desirable to have the image converge. If you want to generate slight variations, you should use a variation seed.

Karras noise schedule

The samplers with the label “Karras” use the noise schedule recommended in the Karras article. If you look carefully, you will see the noise step sizes are smaller near the end. They found that this improves the quality of images.

Comparison between the default and Karras noise schedule.
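The Karras schedule can be sketched in a few lines. The formula and the exponent rho = 7 follow the Karras article. The noise range (0.1 to 10) and the function name are illustrative assumptions on my part, so treat this as a sketch rather than the exact code AUTOMATIC1111 runs.

```python
def karras_sigmas(num_steps, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min (Karras spacing), followed by 0.0."""
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = []
    for i in range(num_steps):
        ramp = i / (num_steps - 1)  # goes from 0 to 1; needs num_steps >= 2
        sigmas.append((max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho)
    return sigmas + [0.0]


sigmas = karras_sigmas(15)
step_sizes = [a - b for a, b in zip(sigmas, sigmas[1:])]

# Large noise reductions at the start, small ones near the end.
for sigma, step in zip(sigmas, step_sizes):
    print(f"sigma = {sigma:7.3f}   reduction in this step = {step:6.3f}")
```

The last printed reduction is the final jump from sigma_min to zero, which is why it is slightly larger than the one before it.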

DDIM and PLMS

DDIM (Denoising Diffusion Implicit Model) and PLMS (Pseudo Linear Multi-Step method) were the samplers shipped with the original Stable Diffusion v1. DDIM is one of the first samplers designed for diffusion models. PLMS is a newer and faster alternative to DDIM.

They are generally seen as outdated and not widely used anymore.

DPM and DPM++

DPM (Diffusion probabilistic model solver) and DPM++ are new samplers, released in 2022, that were designed for diffusion models. They represent a family of solvers of similar architecture.

DPM and DPM2 are similar, except that DPM2 is second order (more accurate but slower).

DPM++ is an improvement over DPM.

DPM adaptive adjusts the step size adaptively. It can be slow since it doesn’t guarantee finishing within the number of sampling steps.

UniPC

UniPC (Unified Predictor-Corrector) is a new sampler released in 2023. Inspired by the predictor-corrector method in ODE solvers, it can achieve high-quality image generation in 5-10 steps.

k-diffusion

Finally, you may have heard the term k-diffusion and wondered what it means. It simply refers to Katherine Crowson’s k-diffusion GitHub repository and the samplers associated with it.

The repository implements the samplers studied in the Karras 2022 article.

Basically, all samplers in AUTOMATIC1111 except DDIM, PLMS, and UniPC are borrowed from k-diffusion.

Evaluating samplers

How to pick a sampler? You will see some objective comparisons in this section to help you decide.

Image Convergence

In this section, I will generate the same image using different samplers with up to 40 sampling steps. The last image at the 40th step is used as a reference for evaluating how quickly the sampling converges. The Euler method will be used as the reference.
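The idea behind this test can be sketched with a toy example. Everything here is an assumption made for illustration: the "image" is a single number, the noise predictor is an exact formula for Gaussian data instead of a U-Net, the schedule is a straight ramp, and the distance is a plain absolute difference. It is not the metric or the code used for the plots in this section.

```python
def predict_noise(x, sigma, data_mean=0.0, data_std=1.0):
    """Toy noise predictor: exact when the 'training data' is a 1-D Gaussian."""
    return sigma * (x - data_mean) / (data_std ** 2 + sigma ** 2)


def euler_sample(num_steps, x_start=10.0, sigma_max=10.0):
    """Run Euler sampling on a linear noise ramp and return the final value."""
    sigmas = [sigma_max * (1 - i / num_steps) for i in range(num_steps + 1)]
    x = x_start
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        x = x + (sigma_next - sigma) * predict_noise(x, sigma)
    return x


# The 40-step result is the reference; every other step count is compared with it.
reference = euler_sample(40)
distance_to_reference = {n: abs(euler_sample(n) - reference) for n in range(2, 41)}

for n in (5, 10, 20, 40):
    print(n, distance_to_reference[n])
```

A converging sampler shows a distance that shrinks toward zero as the number of steps approaches 40.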

Euler, DDIM, PLMS, LMS Karras and Heun

First, let’s look at Euler, DDIM, PLMS, LMS Karras, and Heun as a group since they represent old-school ODE solvers or original diffusion solvers. DDIM converges in about the same number of steps as Euler but with more variations. This is because it injects random noise during its sampling steps.

Image convergence of Euler, DDIM, PLMS, LMS Karras and Heun (Lower is better).

PLMS did not fare very well in this test.

LMS Karras seems to have difficulty converging and has stabilized at a higher baseline.

Heun converges faster but is two times slower since it is a 2nd order method. So we should compare Heun at 30 steps with Euler at 15 steps, for example.

Ancestral samplers

If a stable, reproducible image is your goal, you should not use ancestral samplers. All the ancestral samplers do not converge.

Ancestral samplers do not converge well (Lower is better).

DPM and DPM2

DPM fast did not converge well. DPM2 and DPM2 Karras perform better than Euler, but again at the expense of being two times slower.

DPM adaptive performs deceptively well because it uses its own adaptive sampling steps. It can be very slow.

Convergence of DPM samplers (Lower the better).

DPM++ solvers

DPM++ SDE and DPM++ SDE Karras suffer the same shortcoming as ancestral samplers. They not only don’t converge, but the images also fluctuate significantly as the number of steps changes.

DPM++ 2M and DPM++ 2M Karras perform well. The Karras variant converges faster when the number of steps is high enough.

Convergence of DPM++ samplers (Lower the better).

UniPC

UniPC converges a bit slower than Euler, but not too bad.

Speed

Relative rendering time of each method (lower the better)

Although DPM adaptive performs well in convergence, it is also the slowest.

You may have noticed the rest of the rendering times fall into two groups, with the first group taking about the same time (about 1x), and the other group taking about twice as long (about 2x). This reflects the order of the solvers. 2nd order solvers, although more accurate, need to evaluate the denoising U-Net twice. So they are 2x slower.

Quality

Of course, speed and convergence mean nothing if the images look crappy.

Final images

Let’s first look at samples of the image.

Euler
Heun
DDIM
PLMS
LMS Karras
Euler a
DPM2
DPM2 a
DPM2 a Karras
DPM2 Karras
DPM++ 2M Karras
DPM++ 2S a Karras
DPM++ 2S a
DPM adaptive
DPM fast
DPM++ SDE Karras
DPM++ SDE
UniPC

DPM fast failed pretty badly. Ancestral samplers did not converge to the image that other samplers converged to.

Ancestral samplers tend to converge to an image of a kitten, while the deterministic ones tend to converge to a cat. There are no correct answers as long as they look good to you.

Perceptual quality

An image can still look good even if it hasn’t converged. Let’s look at how quickly each sampler can produce a high-quality image.

You will see perceptual quality measured with BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator). It measures the quality of natural images.

DDIM is doing surprisingly well here, capable of producing the highest quality image within the group in as few as 8 steps.

The image quality of DDIM, PLMS, Heun and LMS Karras (Lower the better).

With one or two exceptions, all ancestral samplers perform similarly to Euler in generating quality images.

The image quality of ancestral samplers (Lower the better).

DPM2 samplers slightly outperform Euler.

The image quality of DPM samplers (Lower the better).

DPM++ SDE and DPM++ SDE Karras performed the best in this quality test.

The image quality of DPM++ samplers (Lower the better).

UniPC is slightly worse than Euler in low steps but comparable to it in high steps.

The image quality of the UniPC sampler (Lower the better).

So… which one is the best?

Here are my recommendations:

  1. If you want to use something fast, converging, new, and with decent quality, excellent choices are
    • DPM++ 2M Karras with 20 – 30 steps
    • UniPC with 20-30 steps.
  2. If you want good quality images and don’t care about convergence, good choices are
    • DPM++ SDE Karras with 10-15 steps (Note: This is a slower sampler)
    • DDIM with 10-15 steps.
  3. Avoid using any ancestral samplers if you prefer stable, reproducible images.
  4. Euler and Heun are fine choices if you prefer something simple. Reduce the number of steps for Heun to save time.

Samplers Explained

You will find information on the samplers available in AUTOMATIC1111. The inner workings of these samplers are quite mathematical in nature. I will only explain Euler (the simplest one) in detail. Many of them share elements of Euler.

Euler

Euler is the most straightforward sampler possible. It is mathematically identical to Euler’s method for solving ordinary differential equations. It is entirely deterministic, meaning no random noise is added during sampling.

Below is the sampling step-by-step.

Step 1: Noise predictor estimates the noise image from the latent image.

Step 2: Calculate the amount of noise needed to be subtracted according to the noise schedule. That is the difference in noise between the current and the next step.

Step 3: Subtract from the latent image the normalized noise image (from step 1) multiplied by the amount of noise to be reduced (from step 2).

Repeat steps 1 to 3 until the end of the noise schedule.
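The three steps can be sketched as follows. This is a toy, not the AUTOMATIC1111 code: the latent is a list of three numbers, the noise predictor is a stand-in that assumes there is only one possible clean image (the hypothetical `clean` values), and the 15-step schedule is an illustrative straight ramp.

```python
def predict_noise(latent, sigma, clean=(0.5, -1.0, 2.0)):
    """Toy stand-in for the noise predictor: assumes the only possible clean image is `clean`.

    Returns the normalized (unit-scale) noise estimate for each latent value.
    """
    return [(x - c) / sigma for x, c in zip(latent, clean)]


def euler_sample(latent, sigmas):
    """Denoise `latent` by following `sigmas`, a noise schedule that ends in 0.0."""
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        noise = predict_noise(latent, sigma)                                # Step 1
        noise_to_remove = sigma - sigma_next                                # Step 2
        latent = [x - n * noise_to_remove for x, n in zip(latent, noise)]   # Step 3
    return latent


sigmas = [10.0 * (1 - i / 15) for i in range(16)]   # illustrative 15-step schedule
start = [sigmas[0] * z for z in (0.3, -1.2, 0.8)]    # stand-in for a random initial latent
result = euler_sample(start, sigmas)
print(result)
```

Because the toy predictor is exact, the loop lands on the clean values when the schedule reaches zero. A real noise predictor is only approximate, which is where the step count starts to matter.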

Noise schedule

But how do you know the amount of noise in each step? Actually, this is something you tell the sampler.

A noise schedule tells the sampler how much noise there should be at each step. Why does the model need this information? The noise predictor estimates the noise in the latent image based on the total amount of noise supposed to be there. (This is how it was trained.)

A noise schedule for 15 sampling steps.

There’s the highest amount of noise in the first step. The noise gradually decreases and is down to zero at the last step.

Changing the number of sampling steps changes the noise schedule. Effectively, the noise schedule gets smoother. A higher number of sampling steps has a smaller reduction in noise between any two steps. This helps to reduce truncation errors.

From random to deterministic sampling

Do you wonder why you can solve a random sampling problem with a deterministic ODE solver? This is called the probability flow formulation. Instead of solving for how a sample evolves, you solve for the evolution of its probability distribution. This is the same as solving for the probability distribution instead of sample trajectories in a stochastic process.

Compared with a drift process, these ODE solvers use the following mappings.

  • Time → noise
  • Time quantization → noise schedule
  • Position → latent image
  • Velocity → Predicted noise
  • Initial position → Initial random latent image
  • Final position → Final clear latent image
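In symbols, the mapping above can be written as follows. I am assuming the formulation of the Karras 2022 article, in which the noise level σ itself plays the role of time. The symbols are introduced here for illustration only: x is the latent image, D(x; σ) is the denoised latent, and ε_θ is the normalized predicted noise.

```latex
% Probability flow ODE with the noise level \sigma as "time":
% the "velocity" of the latent image is the predicted noise.
\frac{dx}{d\sigma} = \frac{x - D(x;\sigma)}{\sigma} = \epsilon_\theta(x;\sigma)

% One Euler step along the noise schedule \sigma_0 > \sigma_1 > \dots > \sigma_N = 0
x_{i+1} = x_i + (\sigma_{i+1} - \sigma_i)\,\epsilon_\theta(x_i;\sigma_i)
```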

Sampling example

Below is an example of text-to-image using Euler’s method. The noise schedule dictates the noise level in each step. The sampler’s job is to reduce the noise by just the right amount in each step to match the noise schedule until it is zero at the last step.

Denoising with Euler’s method and 15 sampling steps.

Euler a

The Euler ancestral (Euler a) sampler is similar to the Euler sampler. But at each step, it subtracts more noise than it should and adds some random noise back to match the noise schedule. The denoised image depends on the specific noise added in the previous steps. So it is an ancestral sampler, in the sense that the path the image takes while denoising depends on the specific random noise added at each step. The result would be different if you were to do it again.
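Here is a minimal sketch of that idea. The split of each step into a "denoise further" part and an "add noise back" part uses the standard ancestral-sampling formula as I understand it from k-diffusion. The 1-D Gaussian noise predictor, the linear schedule, and all function names are illustrative assumptions.

```python
import random


def predict_noise(x, sigma, data_mean=0.0, data_std=1.0):
    """Toy stand-in for the noise predictor: exact when the data is a 1-D Gaussian."""
    return sigma * (x - data_mean) / (data_std ** 2 + sigma ** 2)


def ancestral_step_sizes(sigma, sigma_next):
    """Split the target noise level into a deterministic part and freshly added noise."""
    sigma_up = min(sigma_next, (sigma_next ** 2 * (sigma ** 2 - sigma_next ** 2) / sigma ** 2) ** 0.5)
    sigma_down = max(sigma_next ** 2 - sigma_up ** 2, 0.0) ** 0.5
    return sigma_down, sigma_up


def euler_ancestral_sample(x, sigmas, seed):
    rng = random.Random(seed)
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        sigma_down, sigma_up = ancestral_step_sizes(sigma, sigma_next)
        # Denoise further than the schedule asks for (down to sigma_down <= sigma_next) ...
        x = x + (sigma_down - sigma) * predict_noise(x, sigma)
        # ... then add fresh random noise so the total noise level is sigma_next again.
        x = x + sigma_up * rng.gauss(0.0, 1.0)
    return x


sigmas = [10.0 * (1 - i / 15) for i in range(16)]
run_a = euler_ancestral_sample(10.0, sigmas, seed=1)
run_b = euler_ancestral_sample(10.0, sigmas, seed=2)
print(run_a, run_b)  # same start, different noise draws, different results
```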

DDIM

Denoising Diffusion Implicit Models (DDIM) is one of the first samplers for solving diffusion models. It is based on the idea that the image at each step can be approximated by adding the following three components.

  1. Final image
  2. Image direction pointing to the image at the current step
  3. Random noise

How do we know the final image before we get to the last step? The DDIM sampler approximates it with the denoised image. Similarly, the image direction is approximated by the noise estimated by the noise predictor.
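The three components can be sketched as a single update for one latent value. The notation is an assumption borrowed from the DDIM paper rather than from this article: `alpha_t` and `alpha_prev` are the cumulative signal fractions at the current and next noise level, and `eta` scales the random-noise term (with `eta=0` it vanishes). Treat it as a sketch, not the AUTOMATIC1111 implementation.

```python
import math
import random


def ddim_step(x_t, predicted_noise, alpha_t, alpha_prev, eta=0.0, rng=random):
    """One DDIM update for a scalar latent; requires 0 < alpha_t < alpha_prev <= 1."""
    # 1. Final image, approximated by the denoised version of the current image.
    predicted_final = (x_t - math.sqrt(1 - alpha_t) * predicted_noise) / math.sqrt(alpha_t)
    # Standard deviation of the fresh random noise (zero when eta == 0).
    noise_std = eta * math.sqrt((1 - alpha_prev) / (1 - alpha_t)) * math.sqrt(1 - alpha_t / alpha_prev)
    # 2. Image direction, approximated by the predicted noise.
    direction = math.sqrt(1 - alpha_prev - noise_std ** 2) * predicted_noise
    # 3. Random noise.
    noise = noise_std * rng.gauss(0.0, 1.0)
    return math.sqrt(alpha_prev) * predicted_final + direction + noise


# Sanity check with a known clean value and a perfect noise prediction.
clean, true_noise = 2.0, -0.7
x_noisy = math.sqrt(0.3) * clean + math.sqrt(0.7) * true_noise
x_less_noisy = ddim_step(x_noisy, true_noise, alpha_t=0.3, alpha_prev=0.6)
x_final = ddim_step(x_noisy, true_noise, alpha_t=0.3, alpha_prev=1.0)
print(x_less_noisy, x_final)
```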

LMS and LMS Karras

Much like Euler’s method, the linear multistep method (LMS) is a standard method for solving ordinary differential equations. It aims to improve accuracy through clever use of the values from previous time steps. AUTOMATIC1111 defaults to using up to the last 4 values.

LMS Karras uses the Karras noise schedule.
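To illustrate how previous values are reused, here is a textbook-style sketch of a linear multistep sampler. Each stored noise prediction gets a weight obtained by integrating a Lagrange basis polynomial over the current step. The use of Simpson's rule for that integral, the function names, and the toy noise predictor are my own assumptions, so this is not necessarily identical to what AUTOMATIC1111 runs.

```python
def lms_coefficient(order, sigmas, i, j):
    """Weight of the j-th most recent noise prediction for the step sigmas[i] -> sigmas[i + 1]."""
    def lagrange_basis(tau):
        value = 1.0
        for k in range(order):
            if k != j:
                value *= (tau - sigmas[i - k]) / (sigmas[i - j] - sigmas[i - k])
        return value

    a, b = sigmas[i], sigmas[i + 1]
    # Simpson's rule integrates polynomials up to degree 3 exactly, which covers order <= 4.
    return (b - a) / 6 * (lagrange_basis(a) + 4 * lagrange_basis((a + b) / 2) + lagrange_basis(b))


def lms_sample(predict_noise, x, sigmas, max_order=4):
    history = []  # noise predictions, most recent last
    for i in range(len(sigmas) - 1):
        history.append(predict_noise(x, sigmas[i]))
        history = history[-max_order:]
        order = len(history)  # fewer values are available during the first steps
        x = x + sum(lms_coefficient(order, sigmas, i, j) * history[-1 - j] for j in range(order))
    return x


def toy_noise(x, sigma):
    """Toy predictor that assumes the only clean value is 3.0."""
    return (x - 3.0) / sigma


sigmas = [10.0 * (1 - i / 10) for i in range(11)]
result = lms_sample(toy_noise, 7.0, sigmas)
print(result)
```

With a single stored value the weight reduces to the plain step size, so the first step is an ordinary Euler step.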

Heun

Heun’s method is a more accurate improvement to Euler’s method. But it needs to predict noise twice in each step, so it is twice as slow as Euler.
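A minimal sketch of the difference, again on a toy problem. The 1-D Gaussian noise predictor, the linear 10-step schedule, and the closed-form `exact` answer for this toy are assumptions made for illustration. Falling back to a plain Euler step on the very last step (where the next noise level is zero) is a common convention that I am assuming here.

```python
def predict_noise(x, sigma, data_mean=0.0, data_std=1.0):
    """Toy stand-in for the noise predictor: exact when the data is a 1-D Gaussian."""
    return sigma * (x - data_mean) / (data_std ** 2 + sigma ** 2)


def sample(x, sigmas, use_heun):
    """Return (final value, number of noise predictions used)."""
    calls = 0
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        step = sigma_next - sigma
        slope = predict_noise(x, sigma)
        calls += 1
        if use_heun and sigma_next > 0:
            # Take a trial Euler step, predict the noise again at the new point,
            # and use the average of the two predictions for the real step.
            x_trial = x + step * slope
            slope_next = predict_noise(x_trial, sigma_next)
            calls += 1
            slope = (slope + slope_next) / 2
        x = x + step * slope
    return x, calls


sigmas = [10.0 * (1 - i / 10) for i in range(11)]
exact = 10.0 / (1 + 10.0 ** 2) ** 0.5  # closed-form answer for this toy problem
euler_x, euler_calls = sample(10.0, sigmas, use_heun=False)
heun_x, heun_calls = sample(10.0, sigmas, use_heun=True)
print(f"Euler: error {abs(euler_x - exact):.3f} with {euler_calls} noise predictions")
print(f"Heun:  error {abs(heun_x - exact):.3f} with {heun_calls} noise predictions")
```

Heun ends up closer to the exact answer but uses almost twice as many noise predictions for the same number of steps.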

DPM samplers

Diffusion Probabilistic Model Solvers (DPM-Solvers) belong to a family of newly developed solvers for diffusion models. They are the following solvers in AUTOMATIC1111.

  • DPM2
  • DPM2 Karras
  • DPM2 a
  • DPM2 a Karras
  • DPM Fast
  • DPM adaptive
  • DPM Karras

DPM2 is the DPM-Solver-2 (Algorithm 1) of the DPM-Solver article. The solver is accurate up to the second order.

DPM2 Karras is identical to DPM2 except for using the Karras noise schedule.

DPM2 a is almost identical to DPM2, except noise is added to each sampling step. This makes it an ancestral sampler.

DPM2 a Karras is almost identical to DPM2 a, except for using the Karras noise schedule.

DPM Fast is a variant of the DPM solver with a uniform noise schedule. It is accurate up to the first order. So it is twice as fast as DPM2.

DPM adaptive is a first-order DPM solver with an adaptive noise schedule. It ignores the number of steps you set and adaptively determines its own.

DPM++ samplers are the improved versions of DPM.

UniPC

UniPC (Unified Predictor Corrector method) is a diffusion sampler newly developed in 2023. It consists of two parts

  • Unified predictor (UniP)
  • Unified corrector (UniC)

It supports any solver and any noise predictor.

LCM

The LCM sampler should only be used with Latent Consistency Models (LCM). They are models trained to generate images in 1 step.

In reality, one step doesn’t work very well. Instead, LCM samplers add noise back to the image and denoise again. Here’s the process:

  1. Apply the LCM model to get the “final” image.
  2. Add back noise to match the noise schedule (sigma).
  3. Apply the LCM model to get a better “final” image.
  4. Repeat 2 and 3 until the end of the noise schedule.
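The loop above can be sketched as follows. The `consistency_model` function is a toy stand-in, not a real LCM: it is the exact one-step mapping for 1-D Gaussian data, and the schedule values and names are illustrative assumptions.

```python
import random


def consistency_model(x, sigma, data_mean=0.0, data_std=1.0):
    """Toy stand-in for an LCM: maps a noisy value straight to a 'final' clean value."""
    return data_mean + (x - data_mean) * data_std / (data_std ** 2 + sigma ** 2) ** 0.5


def lcm_sample(x, sigmas, seed=0):
    """`sigmas` is the noise schedule, highest first, ending in 0.0."""
    rng = random.Random(seed)
    final = x
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        final = consistency_model(x, sigma)               # steps 1 and 3: get a "final" image
        if sigma_next > 0:
            x = final + sigma_next * rng.gauss(0.0, 1.0)  # step 2: add noise back
    return final


one_step = lcm_sample(10.0, [10.0, 0.0])
two_steps = lcm_sample(10.0, [10.0, 5.0, 0.0])
print(one_step, two_steps)
```

With the schedule `[10.0, 0.0]` the model is applied exactly once. Each extra entry in the schedule adds one more round of "add noise back, then denoise again".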

Below is an illustration for 2 steps.

More readings
