Midjourney recently released an AI image generator that rivals Dall-E in terms of accuracy and resolution. In this article, we're going to take a look at what the Midjourney AI can do, and how it compares to OpenAI's industry incumbent.
Judging from its website, Midjourney doesn't look too impressive. It's just a simple card with a relatively generic business mission:
But hiding under the Apply for the Beta button, there's some seriously impressive (beta) AI that can generate images based on text prompts.
Several months ago, OpenAI made waves with its Dall-E 2 AI system, which can generate photorealistic images based on text input.
Not only can Dall-E generate images, but it can recreate artistic versions from originals, and insert objects into images, taking into account lighting and shadow conditions.
Here we can see the improvement from Dall-E 1 to Dall-E 2. This is impressive, to say the least. But OpenAI has tightly limited access to this algorithm, for good reason. It can make pretty realistic "fake" images, so the company is worried about misuse.
But the 99.9% of people who just want to experiment are out of luck if they want to access Dall-E 2.
And that's where the Midjourney Image Generator AI comes into play.
This is another AI image generator that generates images from text, but it's much more accessible and public. The images it generates have similar resolutions to the example Dall-E 2 images.
However, it can't currently insert objects into existing images. But other than that, it's unbelievably powerful.
Let's use that fox example that OpenAI touted in an earlier screenshot. The prompt was "a painting of a fox sitting in a field at sunrise in the style of Claude Monet". OpenAI's result is great, but so is Midjourney's, pictured below:
And here's that popular Avocado Chair image that went viral a few years ago.
We recreated it with Midjourney, and the results were almost as good. The OpenAI version looks a bit more photorealistic, but Midjourney isn't too far off.
The UI/UX choice is an interesting one. Midjourney uses a Discord bot to generate images (note that this is still an early beta and will probably change; paying subscribers already get access to a web app). All you need to do is give the bot a prompt, like so, and it will begin generating:
It takes about a minute to get to a final result. You get four options, which you can download from there, or you can choose to upscale one of them for even more clarity.
Here are four options generated from a prompt OpenAI used: "a male mannequin dressed in an orange and black flannel shirt and black jeans".
And from these options, we can generate a "max upscale" version of one of them. This takes much longer than a minute.
Compared to Dall-E, Midjourney images look a lot more like art than like photographs. I can see hundreds of images being generated in real time by thousands of active users on the Discord (there are some pretty funny and wild prompts), and pretty much none of them meet the standard of photorealism.
This, along with the fact that you can't manipulate existing images, is likely why Midjourney beta access is much easier to come by, and why everything is public in a Discord.
If you use it as an art creation tool, though, it can produce some crazy concepts.
But sadly, more realistic images don't look like they're possible with Midjourney, unlike with Dall-E. This was supposed to be a realistic rendering of a MacBook Pro on a white background. It got the texture right, but nothing else was close:
Midjourney is an incredible AI image generator that's much more accessible to everyday users than OpenAI's Dall-E, where beta invites are few and far between. In terms of generating artistic images and scenes, Midjourney rivals all other options out there.
However, if you're looking for more realistic images, the ability to use reference images, or an easy way to manipulate existing images, you'll need to wait for Midjourney to add those features or for Dall-E to become more accessible.