An image with a transparent background is useful for downstream design work. You can reuse the image with different backgrounds without generating a new one. The Layer Diffusion model can generate transparent images with any Stable Diffusion v1.5 or SDXL model.
I will cover:
- Generating transparent images with SD Forge
- Generating transparent images with ComfyUI
- How it works
Software
Currently, you have two options for using Layer Diffusion to generate images with transparent backgrounds.
- SD Forge, a faster alternative to AUTOMATIC1111.
- ComfyUI, a node-based Stable Diffusion software.
If you are new to Stable Diffusion, check out the Quick Start Guide to decide what to use.
Check out the Stable Diffusion course for a self-guided course.
Alternatives to Layer Diffusion
You can also remove or change the background of an existing image with Stable Diffusion to achieve a similar effect.
In general, Layer Diffusion generates higher-quality transparent backgrounds than removing them after the fact.
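That said, if you only need a quick cutout, a dedicated background remover also works. Below is a minimal sketch using the rembg Python library (a separate segmentation-based tool, not part of the Layer Diffusion workflow; the filenames are placeholders).

```python
# Remove the background of an existing image with rembg.
# (Not part of the Layer Diffusion workflow; filenames are placeholders.)
from PIL import Image
from rembg import remove

image = Image.open("photo.jpg")
cutout = remove(image)           # returns an RGBA image with an alpha mask
cutout.save("photo_cutout.png")  # PNG preserves the alpha channel
```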
SD Forge
SD Forge is a faster alternative to AUTOMATIC1111. It looks very much like AUTOMATIC1111, but the backend has been reworked.
Follow the installation guide to set up the software on your local PC.
Updating SD Forge
You first need to update SD Forge to the latest version.
In the File Explorer app, open the webui_forge_cuxxx_torchxxx folder and double-click the update file to update the software.
Install the Layer Diffuse extension
1. Start SD Forge normally.
2. Navigate to the Extensions page.
3. Click the Install from URL tab.
4. Enter the following URL in the URL for extension’s git repository field.
https://github.com/layerdiffusion/sd-forge-layerdiffuse
5. Click the Install button.
6. Wait for the confirmation message that the installation is complete.
7. Restart SD Forge.
Generate SD v1.5 images with transparent backgrounds
On the txt2img page, select an SD v1.5 checkpoint model in the Stable Diffusion checkpoint dropdown menu.
Let’s use the Realistic Vision v5.1 model.
- Checkpoint model: Realistic Vision v5.1
- Prompt:
a beautiful woman with messy hair, dress
- Negative prompt:
disfigured, deformed, ugly
- Sampling method: DPM++ 2M Karras
- Sampling Steps: 20
- CFG scale: 7
- Seed: -1
- Size: 512×768
In the LayerDiffuse section:
- Enable: Yes
- Method: (SD1.5) Only Generate Transparent Image (Attention Injection)
Click Generate to generate an image.
You should see two images generated. The one with a white background is the transparent image. The image with a checkered background is for inspection purposes only.
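To see why the transparent image is useful, here is a minimal Pillow sketch that puts it on a new background without generating a new image (the filenames are placeholders):

```python
# Composite the transparent PNG onto a new background with Pillow.
from PIL import Image

foreground = Image.open("transparent_image.png").convert("RGBA")

# A solid-color background of the same size; any RGBA image works.
background = Image.new("RGBA", foreground.size, (30, 144, 255, 255))

# alpha_composite uses the foreground's alpha channel as the mask.
result = Image.alpha_composite(background, foreground)
result.convert("RGB").save("new_background.png")
```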
A common issue with SD v1.5 models is garbled faces: when the face is not covered by enough pixels, it is not generated correctly.
Luckily, the Layer Diffuse extension is compatible with the high-res fix.
You can turn on Hires Fix to generate a larger image and fix the face.
As of this writing, the transparent layer generation is incompatible with img2img. You cannot use inpainting to fix a transparent image.
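As an aside, if you prefer scripting, the base settings above map to a plain diffusers call like the sketch below. It reproduces only the txt2img settings; the transparency itself comes from the Layer Diffuse extension, which plain diffusers does not include. The Hugging Face repo id is my assumption.

```python
# A sketch of the txt2img settings above in diffusers (no transparency here;
# that part is added by the Layer Diffuse extension in SD Forge).
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",  # assumed Hugging Face repo id
    torch_dtype=torch.float16,
).to("cuda")

# DPM++ 2M Karras corresponds to DPMSolverMultistep with Karras sigmas.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

image = pipe(
    prompt="a beautiful woman with messy hair, dress",
    negative_prompt="disfigured, deformed, ugly",
    num_inference_steps=20,  # Sampling Steps
    guidance_scale=7.0,      # CFG scale
    width=512,
    height=768,
).images[0]
image.save("output.png")
```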
Generate SDXL images with transparent backgrounds
The steps for generating SDXL images with transparent backgrounds are similar. Use the following settings:
- Checkpoint model: Juggernaut XL v7
- Prompt:
a beautiful woman with messy hair
- Negative prompt:
disfigured, deformed, ugly
- Sampling method: DPM++ 2M Karras
- Sampling Steps: 20
- CFG scale: 7
- Seed: -1
- Size: 1216×832
In the LayerDiffuse section:
- Enable: Yes
- Method: (SDXL) Only Generate Transparent Image (Attention Injection)
Tips
Layer Diffusion Models
There are multiple Layer Diffuse models available in the dropdown menu. The Attention Injection method usually works the best. If it doesn’t work for your image, try the Conv Injection method.
Semi-transparent objects
You can experiment with semi-transparent objects such as glasses. The model is smart enough to let the background come through them!
Other styles
The transparent images are not restricted to a photorealistic style. You can generate images in painting or anime style.
- Checkpoint Model: Dreamshaper v8
- Prompt:
close up of a beautiful girl, pink hair
BREAK
blue translucent dress
(I used BREAK to separate the color of the hair from the color of the dress. See the prompt guide.)
ComfyUI
You can generate images with transparent backgrounds in ComfyUI with the Layer Diffuse custom nodes.
Download the workflow
The following workflow generates images with transparent backgrounds. Download the workflow below and drop it into ComfyUI.
Installing missing nodes and updating ComfyUI
Every time you try to run a new workflow, you may need to do some or all of the following steps.
- Install ComfyUI Manager
- Install missing nodes
- Update everything
Install ComfyUI Manager
Install ComfyUI Manager if you haven’t done so already. It provides an easy way to update ComfyUI and install missing nodes.
To install this custom node, go to the custom_nodes folder in the PowerShell (Windows) or Terminal (Mac) app:
cd ComfyUI/custom_nodes
Install ComfyUI Manager by cloning its repository under the custom_nodes folder.
git clone https://github.com/ltdrdata/ComfyUI-Manager
Restart ComfyUI completely. You should see a new Manager button appearing on the menu.
If you don’t see the Manager button, check the terminal for error messages. One common issue is Git not being installed. Installing it and repeating the steps should resolve the issue.
Install missing custom nodes
To install the custom nodes that the workflow uses but you don’t have:
- Click Manager in the Menu.
- Click Install Missing Custom Nodes.
- Restart ComfyUI completely.
Update everything
You can use ComfyUI Manager to update custom nodes and ComfyUI itself.
- Click Manager in the Menu.
- Click Update All. It may take a while to finish.
- Restart ComfyUI completely and refresh the browser page.
Fixing library issues
Do the following if you run into an import error with the Layer Diffuse custom node.
Click Manager > Install PIP packages. Enter the following and press Enter.
diffusers>=0.25.0
Restart ComfyUI.
The import error should be fixed.
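You can confirm the installed version meets the requirement with a quick check in Python:

```python
# Sanity check: the Layer Diffuse custom node needs diffusers >= 0.25.0.
import diffusers
print(diffusers.__version__)
```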
Generating an image with a transparent background
Now, the workflow should be ready to use.
Select an SDXL model in the Load Checkpoint node. The workflow uses the Juggernaut XL v7 model.
Click Queue Prompt to generate an image. You should see an image with a transparent background generated after the Layer Diffuse Decode node.
You can find more workflows, such as adding a background, in the ComfyUI-layerdiffuse repository.
SD 1.5 workflow
If you are a site member, you can download the SD 1.5 Layer Diffuse workflow on the Resources page.
How does Layer Diffusion work?
The Layer Diffusion method is described in the research article Transparent Image Layer Diffusion using Latent Transparency by Lvmin Zhang and Maneesh Agrawala.
Background on transparent images
First, I will give a little background info on how a transparent image in PNG works.
An image in the JPEG format has three color channels: red, green, and blue, commonly called RGB. Each pixel has these three primary color values. This is how different colors are created on a computer screen.
However, the RGB channels don’t define transparency. An extra alpha channel in a PNG image defines how transparent a pixel is: 0 is completely transparent, and the maximum value (255 for an 8-bit channel) is fully opaque.
In other words, a PNG image has four channels: RGB and alpha. The transparency information is encoded in the alpha channel.
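Here is a small Pillow example that makes the four channels concrete (the filename is a placeholder):

```python
# Inspect the four channels of a PNG: red, green, blue, and alpha.
from PIL import Image

img = Image.open("example.png").convert("RGBA")
r, g, b, a = img.split()  # four single-channel images

# For an 8-bit alpha channel, 0 is fully transparent and 255 is fully opaque.
print(img.getpixel((0, 0)))  # e.g. (255, 0, 0, 128): semi-transparent red
```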
The Layer Diffuse model
The Layer Diffuse model includes a newly trained VAE encoder (E) and decoder (D) that encode a transparent image into a latent transparency image. This VAE works with four-channel images: RGB and alpha.
This latent transparency image is added to the latent image of Stable Diffusion, effectively “hiding” it in the original latent image without affecting its perceptual quality.
The U-Net noise predictor is also trained to predict the noise of transparent images. To enable transparent images with any custom model, the change to the U-Net is stored as a LoRA model.
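Here is a conceptual sketch of the encoding step based on my reading of the paper. The function and variable names are illustrative only, not the authors’ code.

```python
# Conceptual sketch: "hiding" latent transparency in the SD latent.
# All names here are illustrative; this is not the authors' code.

def encode_with_transparency(rgba, sd_vae, transparency_encoder):
    """Encode an RGBA image into a latent that also carries the alpha."""
    rgb = rgba[:, :3]                    # the ordinary color channels
    base_latent = sd_vae.encode(rgb)     # the usual Stable Diffusion latent
    # The trained encoder E produces a small offset that encodes the alpha
    # channel while leaving the latent perceptually unchanged.
    offset = transparency_encoder(rgba)
    return base_latent + offset

def decode_with_transparency(latent, transparency_decoder):
    """The trained decoder D recovers the RGBA image from the latent."""
    return transparency_decoder(latent)  # shape: (batch, 4, height, width)
```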
The authors also released models that can add a background to a foreground image, a foreground to a background image, and so on.
Text-to-image
Here’s how text-to-image with Layer Diffusion works.
1. A random tensor (image) is generated in the latent space.
2. The U-Net noise estimator, modified with a Layer Diffusion LoRA, predicts the noise of the latent image at each sampling step.
3. The predicted noise is subtracted from the latent image.
4. Steps 2 and 3 are repeated for each sampling step.
5. At the end of the sampling steps, you get a latent image encoding an image with a transparent background.
6. A special VAE decoder converts the latent image to a pixel image with the RGB and alpha channels.
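The loop above can be written as a runnable toy sketch. The model components below are stand-in stubs, not the real Layer Diffusion networks:

```python
# Toy sketch of the text-to-image loop; the "models" are stand-in stubs.
import torch

def unet_with_lora(latent, t):
    """Stand-in for the LoRA-modified U-Net noise predictor (step 2)."""
    return 0.1 * latent  # a real model conditions on t and the text prompt

def transparent_vae_decode(latent):
    """Stand-in for the special VAE decoder that outputs RGB + alpha (step 6)."""
    return torch.rand(1, 4, 768, 512)  # four channels: RGB and alpha

latent = torch.randn(1, 4, 96, 64)     # step 1: random latent for a 512x768 image
for t in range(20, 0, -1):             # step 4: repeat for each sampling step
    noise_pred = unet_with_lora(latent, t)
    latent = latent - noise_pred / 20  # step 3: subtract the predicted noise
                                       # (a real sampler follows a noise schedule)
rgba = transparent_vae_decode(latent)  # steps 5-6: decode to a transparent image
```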
If you want to learn more about how Stable Diffusion and sampling work, see these two articles I wrote previously.