aizmin in Tutorial

Choosing Between Midjourney and Stable Diffusion

Midjourney is a web service that makes stunning AI pictures using words. It’s similar to Stable Diffusion, but there are some differences. Midjourney can only be used on the internet, and you have to pay for it. So, is it worth paying for Midjourney? And how is it different from Stable Diffusion? Let’s find out.

Midjourney vs Stable Diffusion – Feature comparison

You will find a detailed comparison between Stable Diffusion and Midjourney in this section. Unlike Midjourney, there are multiple ways to use Stable Diffusion. I will confine my analysis to using AUTOMATIC1111, a popular GUI for Stable Diffusion.

Like Midjourney, you can use AUTOMATIC1111 as a web service (e.g. Google Colab). You can also run it locally on a Windows PC or a Mac. New to Stable Diffusion? Check out the Quick Start Guide.

You will see image comparisons throughout the article. I tweaked the prompts and selected models in each case to optimize the images. So they are not direct comparisons of the same prompts but more like attempts to generate similar pictures of various styles.

Midjourney (v4)

Stable Diffusion (v1.5)

Here’s the summary of the comparison.

Feature	Stable Diffusion (AUTOMATIC1111)	Midjourney
Image Customization	High	Low
Ease of getting started	Low	Medium
Ease of generating good images	Low	High
Inpainting	Yes	No
Outpainting	Yes	No
Aspect ratio	Yes	Yes
Model variants	~1,000s	~10s
Negative prompt	Yes	Yes
Variation from a Generation	Yes	Yes
Control composition and pose	Yes	No
License	Permissive. Depends on the model used	Restrictive. Depends on the paid tier
Make your own model	Yes	No
Cost	Free	$10-$60 per month
Model	Open-sourced	Proprietary
Content Filter	No	Yes
Style	Varies	Realistic illustration, artistic
Upscaler	Yes	Yes
Image Prompt	No	Yes
Image-to-image	Yes	No
Prompt word limit	No limit	Not clear

Image customization

There are more ways to customize an image in Stable Diffusion, such as changing the image size, how closely the prompt should be followed, the number of images generated, the seed value, samplers, etc. The options are fewer in Midjourney. You can change the aspect ratio, the seed and whether to stop early.
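To make the difference in available knobs concrete, here is a minimal sketch that collects the settings above into a request body for AUTOMATIC1111's optional web API (enabled by starting the web UI with the `--api` flag) and builds the equivalent Midjourney prompt string. The field names follow the `/sdapi/v1/txt2img` endpoint as I understand it. The helper names, the local address and the default values are my own assumptions, so check them against your installed version.

```python
import json
import urllib.request


def build_txt2img_payload(prompt, width=512, height=512, cfg_scale=7.0,
                          batch_size=1, seed=-1, sampler_name="Euler a",
                          steps=20):
    """Collect the Stable Diffusion settings into one request body."""
    return {
        "prompt": prompt,
        "width": width,                # image size
        "height": height,
        "cfg_scale": cfg_scale,        # how closely the prompt is followed
        "batch_size": batch_size,      # number of images generated
        "seed": seed,                  # -1 asks for a random seed
        "sampler_name": sampler_name,  # sampler
        "steps": steps,
    }


def build_midjourney_prompt(prompt, aspect_ratio=None, seed=None, stop=None):
    """Append the Midjourney options: aspect ratio, seed and early stop."""
    parts = [prompt]
    if aspect_ratio is not None:
        parts.append(f"--ar {aspect_ratio}")
    if seed is not None:
        parts.append(f"--seed {seed}")
    if stop is not None:
        parts.append(f"--stop {stop}")
    return " ".join(parts)


def send_txt2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a locally running web UI (assumed address)."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Only builds and prints the requests; nothing is sent over the network.
    print(build_txt2img_payload("a castle at sunset", seed=42, cfg_scale=9))
    print(build_midjourney_prompt("a castle at sunset", aspect_ratio="3:2",
                                  seed=42, stop=80))
```

The Stable Diffusion side exposes eight settings in this sketch, and the real endpoint accepts more. The Midjourney side has three.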

Verdict: Stable Diffusion wins.

Easy to Get Started

AUTOMATIC1111 is a bit hard to install. After you get it up and running, you will still need to find and install models to get the styles you want.

Midjourney is not as user-friendly as it should be, mainly because it uses Discord as its interface. But it’s still ten times easier to get started.

Pro tip: Want to hide other people’s generations? Create a new private server and invite the Midjourney bot. And you can generate images in peace.

Verdict: Midjourney wins.

Midjourney (v5)

Stable Diffusion (DreamShaper)

Easy to generate good images

Midjourney is well known for making it surprisingly easy to generate artistic images with a lot of fine detail. You don’t need to work very hard to generate good images. In fact, very often, it will ignore part of your prompt and deliver surprisingly aesthetic images.

A Stable Diffusion user needs to put more work into building a good prompt and experiment with models to generate an image of similar quality.

Verdict: Midjourney wins.

Prompt

Both Stable Diffusion and Midjourney support prompts and negative prompts. Both let you add weight to any keyword in a prompt. You can do slightly more prompt tricks with AUTOMATIC1111, such as blending two keywords.
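As a rough illustration of how the prompt syntax differs, the sketch below formats a weighted keyword, a keyword blend and a negative prompt for each tool. It assumes the `(keyword:weight)` and `[from:to:when]` forms in AUTOMATIC1111 and the `keyword::weight` and `--no` forms in Midjourney. The function names are made up for this example.

```python
def a1111_weight(keyword, weight):
    """AUTOMATIC1111 attention syntax: (keyword:weight)."""
    return f"({keyword}:{weight})"


def a1111_blend(first, second, switch_at=0.5):
    """AUTOMATIC1111 prompt editing: start with `first`, then switch to
    `second` after the given fraction of the sampling steps."""
    return f"[{first}:{second}:{switch_at}]"


def midjourney_weight(keyword, weight):
    """Midjourney multi-prompt syntax: keyword::weight."""
    return f"{keyword}::{weight}"


def midjourney_prompt(parts, negative=None):
    """Join prompt parts and append unwanted items with --no.

    In AUTOMATIC1111 the negative prompt is a separate input box instead,
    so it has no counterpart function here.
    """
    prompt = " ".join(parts)
    if negative:
        prompt += " --no " + ", ".join(negative)
    return prompt


if __name__ == "__main__":
    print(a1111_weight("castle", 1.3))
    print(a1111_blend("cat", "dog", 0.5))
    print(midjourney_prompt(
        [midjourney_weight("castle", 1.5), midjourney_weight("forest", 1)],
        negative=["fog", "people"],
    ))
```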

Verdict: Tie.

Midjourney (v4)

Stable Diffusion (Protogen)

Model varieties

Stable Diffusion is an open-source model. People have made models of different styles. There are currently more than a thousand models available for download. Each model can be further modified with LoRA models, embedding models, and hypernetworks. The end result is there are more models than you have time to try.

Midjourney’s models are limited in comparison. They offer v1 to v5 models, plus a few special models like niji, test, testp and HD. There is an additional parameter that lets you “stylize” the image. But the overall offerings are dwarfed by Stable Diffusion’s.

Verdict: Stable Diffusion wins.

Image editing

You can use Stable Diffusion to edit a generated image in many ways. This includes regenerating only part of an image with inpainting, and extending an image through outpainting. You can also simply tell Stable Diffusion what you want to change using the instruct-pix2pix model.

Sadly, you cannot edit an image with Midjourney.

Verdict: Stable Diffusion wins.

Midjourney (v5)

Stable Diffusion (Dreamlike Photoreal)

Style

Midjourney v4 produces images with a realistic illustration style by default. It can also generate other art styles when prompted correctly. A realistic photo is possible in the v5 model.

Stable Diffusion can generate a broader range of styles, from realistic photos to abstract art, thanks to the passionate community and the ease of training new models. Users can remix models with embeddings, LoRAs, or hypernetworks. This can produce surprising effects and is fun to play with.

Verdict: Stable Diffusion wins.

Variation from a generation

Both can generate slight variations of a generated image. In Midjourney, you press the V buttons under the images. In AUTOMATIC1111, you use the variational seed option.
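For intuition about what a variational seed does, the sketch below blends the starting noise from the main seed with noise from a second seed using spherical interpolation, which is my understanding of the approach AUTOMATIC1111 takes. It uses plain Python lists rather than tensors. The function names and the vector length are illustrative assumptions, not the web UI's actual code.

```python
import math
import random


def gaussian_noise(seed, size):
    """Deterministic Gaussian noise: the same seed gives the same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def slerp(strength, low, high):
    """Spherically interpolate from `low` (strength 0) to `high` (strength 1)."""
    norm_low = math.sqrt(sum(x * x for x in low))
    norm_high = math.sqrt(sum(x * x for x in high))
    dot = sum(a * b for a, b in zip(low, high)) / (norm_low * norm_high)
    dot = max(-1.0, min(1.0, dot))  # guard against rounding error
    omega = math.acos(dot)          # angle between the two noise vectors
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel vectors: fall back to linear blending.
        return [(1 - strength) * a + strength * b for a, b in zip(low, high)]
    w_low = math.sin((1 - strength) * omega) / sin_omega
    w_high = math.sin(strength * omega) / sin_omega
    return [w_low * a + w_high * b for a, b in zip(low, high)]


def variation_noise(seed, variation_seed, strength, size=16):
    """Starting noise for a variation of the image made with `seed`."""
    base = gaussian_noise(seed, size)
    other = gaussian_noise(variation_seed, size)
    return slerp(strength, base, other)
```

With a strength of 0 you get the original noise back, and with a strength of 1 you get the noise of the second seed. A small strength keeps the starting noise close to the original, which is why the resulting image changes only slightly.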

Verdict: Tie.

Input image

Output image

Control composition and pose

You can control composition and pose in Stable Diffusion in multiple ways: image-to-image, depth-to-image, instruct-pix2pix and ControlNet. In Midjourney, the closest option is using an image prompt, which acts like a text prompt to control image generation.

Verdict: Stable Diffusion wins.

Cost

Using Stable Diffusion with AUTOMATIC1111 is free if you run it on your own computer. In contrast, using Midjourney would set you back at least $10 a month.

Verdict: Stable Diffusion wins.

License

Many people are unaware that the ownership of the images you generate using Midjourney depends on your paid tier. You own nothing if you are not a paid subscriber. You have more rights if you pay more. In any case, Midjourney can use your images without asking you first. See their terms of service.

In contrast, Stable Diffusion claims no right to the images you generate. You are allowed to distribute and further train the model and even sell it. However, models further fine-tuned by others may have additional restrictions. So be sure to read the license and terms of use when you use a new model.

Verdict: Stable Diffusion wins.

Midjourney (v4)

Stable Diffusion (DreamShaper)

Content Filter

There is a content filter in the original Stable Diffusion v1 software, but the community quickly shared a version with the filter disabled. So in practice, there’s no content filter in the v1 models. v2 is trickier because NSFW content was removed from the training images. It cannot generate explicit content by design. In contrast, generating explicit images is off limits in Midjourney. It is blocked even at the prompt level. You can get banned if you try.

Verdict: Stable Diffusion wins.

Making your own models

Perhaps the biggest appeal of Stable Diffusion is the possibility of making your own models. If you don’t like the images you see, you can always train your own model. You can use dreambooth, textual inversion, LoRA, hypernetwork, or simply run additional rounds of training with your own images. Unfortunately, you cannot do that with Midjourney.

Verdict: Stable Diffusion wins.

Upscaler

Both Stable Diffusion and Midjourney have upscalers. AUTOMATIC1111 offers more choices and parameters, and you can easily install additional upscalers.

Verdict: Stable Diffusion wins.

Image Prompt

You can use an image as a prompt together with a text prompt in Midjourney. It will generate a combination of the content of the image prompt and the text prompt. That’s not the same as image-to-image in Stable Diffusion, where the input image acts as an initial image but is not used in conditioning. The closest thing Stable Diffusion will have is Stable Diffusion Reimagine, which uses an input image as conditioning in place of the text prompt.

Verdict: Midjourney wins.

Image-to-image

Currently, Midjourney offers no image-to-image functionality, a method for diffusion models to generate images based on another image. This is unsurprising since the earlier versions of Midjourney may not be diffusion models.
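For readers curious how image-to-image uses the input image as a starting point, here is a simplified sketch of how a typical pipeline decides how much of the denoising schedule to run. The parameter name `strength` is an assumption (AUTOMATIC1111 labels a similar setting "denoising strength"), and real implementations differ in the details.

```python
def img2img_schedule(num_steps, strength):
    """Return (start_step, steps_to_run) for an image-to-image run.

    The input image is noised up to the level of `start_step`, and only
    the remaining steps are denoised. A low strength keeps more of the
    input image, and a strength of 1.0 behaves like text-to-image.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_to_run = min(int(num_steps * strength), num_steps)
    start_step = num_steps - steps_to_run
    return start_step, steps_to_run


if __name__ == "__main__":
    for strength in (0.0, 0.5, 0.75, 1.0):
        print(strength, img2img_schedule(20, strength))
```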

Verdict: Stable Diffusion wins.

Prompt limit

Midjourney’s user guide used to state that prompts were limited to about 60 words, but that statement has been removed. On the other hand, AUTOMATIC1111 now supports unlimited prompt length.

Verdict: Not clear.

Is Midjourney using Stable Diffusion?

The Midjourney v5 model is not Stable Diffusion. That’s all they said. However, the improvements in v5 look suspiciously similar to those in Stable Diffusion v2: The prompt needs to be more literal and specific. People are getting five fingers… Could Midjourney share some components of Stable Diffusion v2, like the OpenCLIP text embedding? It certainly makes sense to use a diffusion model because of the lower running costs.

Is Midjourney better than Stable Diffusion?

I don’t want to give a diplomatic answer but it really depends on what you are looking for.

Midjourney has its own unique style – high contrast, good lighting, and realistic illustration. It’s super easy to create images with crazy amounts of detail. You can get good images without trying very hard.

On the other hand, Stable Diffusion can also create similar or better images, but it requires a bit more know-how. So, if you’re up for a challenge and want to dive deep into the technical side of things, then Stable Diffusion is the perfect fit for you.

How does Midjourney differ from Stable Diffusion?

You can read the first section for a point-by-point comparison. The main difference lies in the operating model and the users they cater to.

Midjourney chose a proprietary business model. They take care of the model development, training, tweaking and the user interface. Everything should be simple and work out of the box. You tell the model what you want, and you get it.

Stable Diffusion is software that embraces an open-source ecosystem. The model’s code and training data are available for everyone to access. You can build on it and fine-tune the model to achieve exactly what you want. And guess what? People have already done that! There are thousands of models that have been publicly created and shared by users just like you.

But that’s not all. New and amazing tools are being created every week, and it never ceases to amaze me how creative people can be when given the opportunity to do so.

Midjourney (v5)

Stable Diffusion (Realism Engine)

Generating a Midjourney image in Stable Diffusion

Recreating a Midjourney image in Stable Diffusion is tricky but possible. I use the following workflow.

Use the same prompt to see what you get. You can start with the v1.5 base model. The result is usually very different.
Adjust the keywords of the prompt. You will likely find that Midjourney ignores some keywords and takes the liberty of adding others. I usually look at the keywords in the prompt generator to see how to achieve the same effect.
You will likely want to add a negative prompt (the universal one is usually fine).
You will definitely need to add some lighting keywords. Pay attention to the contrast and luminosity. Choose the lighting keywords that can achieve a similar effect.
Since Midjourney images are on the darker side, you may want to add a LoRA like epi_noiseoffset.
Finally, experiment with different models and tweak the prompt.

And use ControlNet if you want to copy the composition.
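The workflow above can be sketched as a small prompt builder for AUTOMATIC1111. The subject keywords, lighting keywords, negative prompt and LoRA weight are placeholder assumptions for illustration, not a tested recipe. The `<lora:name:weight>` tag is the prompt syntax AUTOMATIC1111 uses for LoRA models, and `epi_noiseoffset` is the name as written in this article; the file on your disk may be named differently.

```python
def build_recreation_prompt(subject_keywords, lighting_keywords,
                            lora_name=None, lora_weight=1.0):
    """Join subject and lighting keywords, then append an optional LoRA tag."""
    parts = list(subject_keywords) + list(lighting_keywords)
    if lora_name is not None:
        # AUTOMATIC1111 activates a LoRA from inside the prompt text.
        parts.append(f"<lora:{lora_name}:{lora_weight}>")
    return ", ".join(parts)


# Placeholder values, chosen only to show the shape of the result.
prompt = build_recreation_prompt(
    subject_keywords=["portrait of a woman", "highly detailed"],
    lighting_keywords=["dramatic lighting", "high contrast"],
    lora_name="epi_noiseoffset",
    lora_weight=1.0,
)
negative_prompt = "blurry, deformed, low quality"

print(prompt)
print(negative_prompt)
```

Keeping the lighting keywords in their own list makes it easy to swap them while you compare the contrast and luminosity against the Midjourney image.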

I will write another article to detail the process step-by-step. Stay tuned!

Which one should I use?

Midjourney and Stable Diffusion both have a large user base. They have their strengths and weaknesses.

Midjourney is for you if

You want to generate stunning images without a deep learning curve.
You are busy and cannot afford the time to set up and learn the models.
You like the Midjourney styles.
You are looking for an out-of-the-box AI image solution.
You don’t mind paying a subscription fee.
You are ok with their terms of use.

Stable Diffusion is for you if

You want a completely free solution.
You want to run everything locally.
You are tech-savvy.
You like tinkering with your setup, trying out model combinations, and using new tools.
You need the image-editing capability.
You prefer open-source tools.
You want more control over your images.

I hope this article helps you understand the difference between Midjourney and Stable Diffusion and decide which one to use. If you can afford the time and resources, you should try out both. You will likely find both have their place in your workflow. I use both of them and am often fascinated by the challenge of reproducing one tool’s images with the other.
