Simplified Stable Diffusion Process

Fooocus is a free and open-source AI image generator based on Stable Diffusion. It attempts to combine the best of Stable Diffusion and Midjourney: open source, offline, free, and easy to use.

Fooocus has optimized the Stable Diffusion pipeline to deliver excellent images. You can spend less time on tweaking the settings and more time on creating the images you want.

In this post, we will cover

  • Pros and Cons of Fooocus
  • How to install Fooocus
  • Basic usage
  • Setting styles
  • Upscaling
  • Inpainting and outpainting
  • Using Image Prompt

Pros and Cons of Fooocus

The benefits of using Fooocus are

  • Easy to install
  • Easy to use
  • Generates high-quality images out of the box

The drawbacks of using Fooocus are

  • Not as customizable as AUTOMATIC1111 or ComfyUI
  • The functionality is not as extensive as other GUIs

How to install Fooocus

Minimum system requirement

You need an Nvidia card with 8GB of VRAM. Other setups may also work. See the full list of minimum requirements.

Windows

Follow these steps to install Fooocus on Windows.

  1. Download the zip file on this page.
  2. Move the zip file to the folder where you want to install Fooocus.
  3. Right-click on the zip file and select Extract All… to extract the files.

Double-click run.bat to start Fooocus.

It will download the models the first time you run it.

AMD GPU, Mac, Linux and Colab

You can also install Fooocus on AMD GPU, Mac, Linux, and Colab.

Using Fooocus

Fooocus is super easy to use. In the default mode, you enter the prompt and press Generate (Ctrl+Enter on Windows, Cmd+Enter on Mac).

a dragon, snow, moon

The Fooocus GUI

The default model is juggernautXL, a fine-tuned Stable Diffusion XL model. It is a general-purpose model capable of producing various styles.

Prompt expansion

You don’t need to write long and complicated prompts like those on popular image-sharing sites. Fooocus will expand your prompt with a GPT-2 based prompt engine.

For example, the prompt:

a dragon, snow, moon

is expanded into the following prompt under the hood.

a dragon, snow, moon, light, intricate, elegant, sharp focus, beautiful dynamic, highly detailed, very sleek, professional fine detail, cinematic, dramatic ambient bright colors, perfect, warm color, epic composition, striking, brave, attractive, elite, best, vivid, clear, coherent, advanced, creative, cute, artistic, trendy, cool, gorgeous, awesome

Advanced Settings

Selecting the Advanced checkbox brings up the advanced setting menu.

Performance settings

True to its design philosophy, even the advanced setting is pretty easy to understand.

The performance section.

  • Speed: A good balance. Perform 30 sampling steps.
  • Quality: Perform twice as many sampling steps.
  • Extreme Speed: Use LCM LoRA to reduce sampling steps.

As expected from the settings, Speed and Quality don’t differ much. Going beyond 30 steps yields diminishing returns for SDXL models.

The Extreme Speed setting generates lower-quality images. This is expected from the LCM-LoRA model.

Below is a comparison of generation time on a Windows system with an RTX4090 GPU card.

Setting          Time        Relative
Speed            17.3 sec    1x
Quality          25.2 sec    1.5x
Extreme Speed    10.4 sec    0.6x

Time for generating two 1024×1024 images.
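
The relative figures in the comparison above are each mode's time divided by the Speed time. The short Python sketch below reproduces that arithmetic from the measured seconds. The function name relative_times is only illustrative and is not part of Fooocus.

```python
# Generation times from the comparison above
# (seconds for two 1024x1024 images on an RTX4090).
timings = {"Speed": 17.3, "Quality": 25.2, "Extreme Speed": 10.4}


def relative_times(timings, baseline="Speed"):
    """Return each mode's time as a multiple of the baseline, rounded to 1 decimal."""
    base = timings[baseline]
    return {mode: round(seconds / base, 1) for mode, seconds in timings.items()}


ratios = relative_times(timings)
for mode, ratio in ratios.items():
    print(f"{mode}: {timings[mode]} sec ({ratio}x)")
```

Note that Quality runs twice as many sampling steps as Speed but takes only about 1.5 times as long in this measurement.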

I like the ease of using LCM-LoRA (Extreme Speed). Using LCM-LoRA in AUTOMATIC1111 requires changing the CFG scale and the number of sampling steps, which is easy to forget. Fooocus takes care of all this with a single selection.

Aspect Ratios

Fooocus provides an extensive list of image sizes for you to choose from. Interestingly, there’s no way to enter a custom image size in the GUI.

There are many reasons you may want a specific image size, such as compatibility with a Stable Diffusion model or publishing requirements.

To add an image resolution to the list, look for a file called config_modification_tutorial.txt in the Fooocus folder.

This is a template for the configuration file config.txt.

Rename config.txt to config.txt.original.

Make a copy of the file config_modification_tutorial.txt and rename it to config.txt.

Edit config.txt in a text editor (I use Notepad++).

Remove the explanatory note on top.

Add a new resolution to the list of “available_aspect_ratios”. For example:

    "available_aspect_ratios": [
        "704*1408",
        "704*1344",
        "768*1344",
        "768*1280",
        "832*1216",
        "832*1152",
        "896*1152",
        "896*1088",
        "960*1088",
        "960*1024",
        "1024*1024",
        "1024*960",
        "1088*960",
        "1088*896",
        "1152*896",
        "1152*832",
        "1200*800",
        "1216*832",
        "1280*768",
        "1344*768",
        "1344*704",
        "1408*704",
        "1472*704",
        "1536*640",
        "1600*640",
        "1664*576",
        "1728*576"
    ],

Restart Fooocus and you should see the new resolution added.
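
If you prefer to script this edit, the manual steps above can be sketched in Python. This is a minimal sketch under two assumptions: config.txt is valid JSON once the explanatory note has been removed, and it already contains the available_aspect_ratios list. The function name add_aspect_ratio and the example path are hypothetical and not part of Fooocus.

```python
import json
from pathlib import Path


def add_aspect_ratio(config_path, resolution):
    """Append a "width*height" entry to available_aspect_ratios if it is missing.

    Assumes the file is valid JSON and already has the key; a KeyError is
    raised otherwise so that an incomplete config is never written back.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    ratios = config["available_aspect_ratios"]
    if resolution not in ratios:
        ratios.append(resolution)  # added at the end, not in sorted position
        path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return ratios


# Example usage (placeholder path; point it at your own Fooocus folder):
# add_aspect_ratio("Fooocus/config.txt", "1200*800")
```

The script rewrites the whole file, so keep the config.txt.original backup from the earlier step in case you need to revert.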

Style

In Fooocus, you don’t need to engineer a prompt to achieve a certain style. You use the Style menu to get there.

There are so many preset styles! You can visualize many of them in the SDXL style guide.

You can combine multiple styles. But many of them only have an effect when the default styles are unchecked.

You can also add a negative prompt to dial in the image. For example, add “B&W” to the negative prompt (in Settings > Negative Prompt) to generate a color image.

Model

You can specify a checkpoint model and LoRA in the Model tab.

The path of models can be found or changed in config.txt in the Fooocus folder.

Upscale an image

To upscale an image in Fooocus:

  1. Select the Input Image checkbox.
  2. Under Upscale or Variation, select the upscale option you want.
  3. Press Generate.

Image variation

Like Midjourney’s V1/V2/V3/V4 function, you can generate variants of an image.

  1. Select the Input Image checkbox.
  2. Under Upscale or Variation, select the Vary option you want.
  3. Press Generate.

Here is how much change the Vary Subtle and Vary Strong options create. Neither changes the image very much.

Note: You can create image variations with the Extra Seed option in AUTOMATIC1111.

Image Prompt

You can use an image as an additional prompt, as in AUTOMATIC1111. But unlike AUTOMATIC1111, you don’t need to install extensions. It is part of Fooocus’ basic functionality.

To use Image Prompt, check the Input Image checkbox and select the Image Prompt tab.

Upload an image to one of the image slots.

You probably want to check the Advanced checkbox at the bottom of the page to enable editing more settings.

ImagePrompt

The default Image Prompt option is ImagePrompt.

The settings should look familiar to you if you have used ControlNet in AUTOMATIC1111.

  • Stop At: Stop the Image Prompt control after a fraction of the sampling steps. 0.5 means stopping after 15 of 30 sampling steps.
  • Weight: The strength of the Image Prompt control.

Increase either of them to increase the effect of the image prompt.
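
The Stop At arithmetic can be sketched in a few lines of Python. This is only an illustration of the fraction-to-step conversion described above. The function name stop_step is hypothetical, and rounding to the nearest step is an assumption, since the text does not say how Fooocus handles fractional steps.

```python
def stop_step(stop_at, total_steps):
    """Return the sampling step after which the Image Prompt control ends.

    stop_at is the Stop At fraction between 0 and 1. Rounding to the nearest
    whole step is an assumption made for this sketch.
    """
    if not 0.0 <= stop_at <= 1.0:
        raise ValueError("stop_at must be between 0 and 1")
    return round(stop_at * total_steps)


print(stop_step(0.5, 30))  # Speed mode with 30 steps, as in the example above
print(stop_step(0.5, 60))  # Quality mode, which performs twice as many steps
```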

Using the prompt:

a chair that resembles a cat

PyraCanny

PyraCanny is a pyramid-based Canny edge control method. The higher resolution of SDXL images may cause a standard Canny algorithm to miss some details. This method detects edges hierarchically in multiple resolutions.

Use PyraCanny like Canny ControlNet to copy composition or human poses.

Prompt:

A woman

Upload an image and select PyraCanny.

CPDS

CPDS is a depth-based structure detection method. It copies the 3D composition of the image but not the lines. Similar to the Depth ControlNet, it changes the image more.

See a sample of CPDS below. It copies the composition but not the facial details, such as hairstyle and the direction in which she is looking.

FaceSwap

FaceSwap is like IP-Adapter Face in ControlNet. It copies the face in the reference image.

Here’s an example. The prompt is:

a woman, praying

Multiple Image Prompts

Like ControlNet in AUTOMATIC1111, you can use multiple image prompts in Fooocus.

Let’s illustrate with an example of using two Image Prompts:

  • FaceSwap – weight 0.5, stop at 0.9: Copy the face.
  • PyraCanny – weight 0.5, stop at 0.5: Copy the pose.

You often need to set the weights lower when using multiple image prompts. Otherwise, you may see artifacts like weird colors.

PyraCanny does a good job of copying the pose. The lower Weight and earlier Stop At value loosen the control, which helped generate a different background.

FaceSwap does an okay job of copying the face. You may be able to apply a stronger effect by increasing the Weight and Stop At values.

Inpainting

Inpainting regenerates part of the input image. It is straightforward in Fooocus.

Check the Input Image checkbox and select Inpaint or Outpaint.

Upload an image you want to inpaint.

Use the paintbrush tool to mask over an area you want to regenerate.

Here’s a result.

The Improve Detail method keeps the input image more or less the same but improves fine details.

The Modify Content method lets you modify the masked area with a prompt. It’s similar to inpainting with a high denoising strength.

Inpaint additional prompt:

a woman with sunglasses

Outpainting

Outpainting extends an image in one or more directions.

Check the Input Image checkbox and select Inpaint or Outpaint.

Upload an image you want to outpaint.

In the Method dropdown menu, select Inpaint or Outpaint (default).

Select the desired Outpaint Direction.

Below is an example of an image outpainted in the lateral directions.

Original
Outpainted

Describe

The Describe function in Input Image guesses a prompt for an image. It is similar to the Interrogate CLIP button in AUTOMATIC1111.

Upload an image to the canvas of the Describe tab and press Describe this image into Prompt.

The guessed prompt will appear in the prompt input box.

Sharing models with AUTOMATIC1111

If you have installed AUTOMATIC1111 or another Stable Diffusion GUI, you may want to share models between them to conserve disk space.

This can be done by editing the config.txt file in the Fooocus folder. At the top of the file, you can modify “path_checkpoints”, “path_loras”, etc. to point to the existing locations of the models.
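
As a rough illustration, the same change can be scripted in Python. This is a sketch that assumes config.txt is valid JSON. Only the path_checkpoints and path_loras keys come from the text above. The function name and the folder paths in the usage comment are placeholders, so substitute the locations where your other GUI actually stores its models.

```python
import json
from pathlib import Path


def point_to_shared_models(config_path, checkpoints_dir, loras_dir):
    """Set path_checkpoints and path_loras in config.txt to existing model folders.

    All other settings in the file are left unchanged.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["path_checkpoints"] = str(checkpoints_dir)
    config["path_loras"] = str(loras_dir)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Example usage (all three paths are placeholders for your own installation):
# point_to_shared_models(
#     "Fooocus/config.txt",
#     "stable-diffusion-webui/models/Stable-diffusion",
#     "stable-diffusion-webui/models/Lora",
# )
```

Back up config.txt before running a script like this, then restart Fooocus so it picks up the new paths.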

Alternative to Fooocus

You can consider the following alternatives:

  • AUTOMATIC1111: The Stable Diffusion GUI with the most features. The de facto standard.
  • SD.Next: A more curated version of AUTOMATIC1111. Many must-have extensions are pre-installed.
  • ComfyUI: A node-based Stable Diffusion GUI. The learning curve is a bit steep but knowing it goes a long way.

Fooocus vs Midjourney

Midjourney is a popular and proprietary AI image generator. You can replicate many of Midjourney’s functions with Stable Diffusion.

Fooocus is designed as a Midjourney replacement. If you like the simplicity of Midjourney, you may also like Fooocus. Midjourney is Discord-based, and I would say Fooocus has a better user interface.

See the feature comparison between Fooocus and Midjourney.

Thoughts on Fooocus

I’m a regular user of Stable Diffusion, Midjourney, and DALLE. I always appreciate the infinite tweak-ability of Stable Diffusion, the quality of Midjourney, and how accurately DALLE follows a prompt.

Fooocus fills the gap of being simple and easy to use.

I am sometimes reluctant to use Midjourney because of the hassle of dealing with the Discord interface. It is a bit difficult to tweak the prompt and settings.

Fooocus attempts to deliver a Midjourney experience with the added benefits of running locally, being free of censorship, and costing nothing. As a bonus, it has a properly designed GUI!

I will use Fooocus to get high-quality images quickly, such as the cover image of this post.
