Fix Hand Stability in Stable Diffusion

Stable Diffusion has a hand problem. It is pretty common to see deformed hands or missing/extra fingers. In this article, we will go through a few ways to fix hands.

We will cover fixing hands with

  • Basic Inpainting
  • DW Pose ControlNet
  • Hand Refiner with Mesh Graphormer

Software

We will use AUTOMATIC1111, a popular and free Stable Diffusion AI image generator. You can use this GUI on WindowsMac, or Google Colab.

If you are new to Stable Diffusion, check out the Quick Start Guide to get started.

Check out the Stable Diffusion courses to level up your skills.

You will need the ControlNet extension to follow this tutorial.

Basic Inpainting

This method uses the basic inpainting function. You can use Stable Diffusion 1.5 or XL with this method.

Step 1: Generate an image

On the txt2img page, generate an image. I will use the Realistic Vision v5. (A SD 1.5 model)

Prompt:

a cute woman showing two hands, christmas, palm open, town, decoration

  • Sampling method: DPM++ 2M Karras
  • Sampling Steps: 20
  • CFG scale: 7
  • Seed: -1
  • Size: 512×768 (Adjust the image size accordingly for SDXL)

The hands should be pretty bad in most images. This is typical for an SD 1.5 model.

Step 2: Inpaint hands

Now, let’s send this image to Inpainting by clicking the Send to Inpaint button under the image.

You should now be on img2img > Generation > Inpaint tab.

Mask the area you want to regenerate. For the best result, fix one hand at a time.

Set the Inpaint Area to Whole Picture.

Set denoising strength to 0.5. (Increase to change more)

This method requires some cherry-picking. Generate a few images and pick the best one to go forward with.

Here’s the one I chose.

Next, repeat the same process for the other hand. Use the Send to Inpaint button and redraw the mask.

You can also apply it to one finger to adjust its length.

After a few rounds of inpainting the hands and fingers, I got the following image with repaired hands.

ControlNet DWPose Inpainting

DWPose is a powerful preprocessor for ControlNet Openpose. It can extract human poses, including hands.

The trick is to let DWPose detect and guide the regeneration of the hands in inpainting. I will explain how it works.

You will need the ControlNet extension and OpenPose ControlNet model to apply this method. See this post for installation.

Step 2: Inpaint with DWPose

Mask the area you want to regenerate, which is the hands in this case. You should include a bit of the surrounding areas.

Set Inpaint area to Whole Picture.

It is important to set a size compatible with SD 1.5. It is 512×768 in this case for a 2:3 aspect ratio.

Set the denoising strength to 0.7.

  • Enable: Yes
  • ControlType: OpenPose
  • Preprocessor: dw_openpose_full
  • Model: control_v11p_sd15_openpose

Click Generate.

inpainted with dw pose.

While not perfect, you should at least get hands with 5 fingers!

How does it work? Turning on ControlNet in inpainting uses the inpaint image as the reference. The DW OpenPose preprocessor detects detailed human poses, including the hands.

It made the correct assumption that a hand has 5 fingers. However, the lengths and joints can still be incorrect.

Refining the hands

To fix the hands further, you can do another round of inpainting.

If a finger is too long, you can mask only the part you don’t want to erase.

Turn off the ControlNet and click Generate.

Now you get a shorter thumb.

Mask the more of the surrounding area.

Increase the denoising strength to 0.85 and generate a new image.

Now you get a better hand.

You can do multiple rounds of this refinement by turning ControlNet on and off.

Upscaling

Finally, doing a standard ControlNet Tile Upscale can mitigate some issues coming from low resolution.

Send the image to the img2img tab by clicking the Send to Img2img button.

You should now be in img2img > Generation > img2img tab.

Set Resize by to 2. We are upscaling the image 2x to 1536×1024 pixels.

Set denoising strength to 0.9.

In the ControlNet section, set:

  • Enable: Yes
  • Control Type: Tile/Blur
  • Preprocessor: tile_resample
  • Model: control_v11f1e_sd15_tile

Generate an image. You add some details to the hands to make it look more realistic.

Here’s a comparison between the original and the final image.

You can also use this method in ComfyUI. However, the interactive nature of this approach makes it hard to use in ComfyUI effectively.

Applying to SDXL models

This inpainting method can only be used with an SD 1.5 model.

If you have generated an image with an SDXL model, you can work around it by switching to an SD 1.5 model for inpainting.

Make sure to scale the image back to a size compatible with the SD 1.5 model. (e.g. 768 x 512)

Hand Refiner

HandRefiner is a ControlNet model for fixing hands.

Hand Refiner pipeline
The Inference pipeline of HandRefiner. (credit: research article)

The preprocessor uses a hand reconstruction model, Mesh Graphormer, to generate the mesh of restored hands. It then converts the mesh to a depth map.

The ControlNet model is a depth model trained to condition hand generation. You can also use the standard depth model, but the results may not be as good.

Using HandRefiner in AUTOMATIC1111

Download the HandRefiner ControlNet model. Put the file in the folder stable-diffusion-webui > models > ControlNet.

After generating an image, use the Send to Inpaint button to send the image to inpainting.

Mask the area you want to regenerate, which is the hands in this case. You should include a bit of the surrounding areas.

Set Inpaint area to whole picture.

It is important to set a size compatible with SD 1.5. It is 512×768 in this case for a 2:3 aspect ratio.

Set the denoising strength to 0.75.

  • Enable: Yes
  • ControlType: OpenPose
  • Preprocessor: depth_hand_refiner
  • Model: control_sd15_inpaint_depth_hand_fp16

Update the ControlNet extension if you don’t see the preprocessor.

Click the Refresh button next to the Model dropdown if you don’t see the control model you just downloaded.

Click Generate. Now, you get fixed hands!

You should also get the control image. Check to see if it looks right.

Related Posts