I gave four AI image generators a 'realism test' — and the winner surprised me
Ideogram, Flux, Stable Diffusion and Imagen 3 compared
Over 34 billion (yes, with a B) AI images are created every day, according to What's the Big Data. The market has become so commoditized that many AI image generators are now free of charge. If those numbers sound crazy, it’s because they are, especially when you remember the market only got going properly less than two years ago.
Meanwhile, this bonanza is a huge win for users across the world. Image quality has rocketed upwards at the same time as prices have plummeted. So we thought it was time to take a look at four random generators, including two we skipped during our last 7-way competition.
Here we're testing Imagen 3 from Google DeepMind, Flux from Black Forest Labs, Ideogram 2.0 and, as a reference, newer versions of the veteran open-source Stable Diffusion models.
We ran four prompts as a test to see how the four generators compare head to head.
- A modern rainy street market in 2024 New York with stalls selling food and antiques, a young man in a bomber jacket is buying something from a stall
- A fashion photo of a beautiful penthouse apartment in San Francisco with expensive modern furniture and breathtaking views of the Bay.
- A pretty young lady is sitting in an English country garden, she is sitting at a table with a birthday cake on it, and her family happily standing around to celebrate her special day.
- A photo of a group of majestic elephants walking past some huts in the African savanna. A few villagers are sitting and standing watching the elephants pass by.
Ideogram 2.0
Ideogram continues to impress mightily with its excellent image quality, and above all text manipulation. For a long time, it was the only game in town if you wanted to generate an AI image with coherent text.
Times have changed, and more platforms now offer good quality text, but Ideogram 2.0 promises to raise the bar once more. It still suffers from a few glitches now and then, but overall the quality of the images is superb. Ideogram won our last 7-round test.
The prompt adherence rocks, colors and detail are top class, and the overall impression is extremely professional. Surprisingly, our test prompts didn't trigger any text elements at all, despite the other products adding them into the mix. No worries, we know how good Ideogram is with text by now.
Imagen 3
Google just dropped the latest and greatest version of its Imagen 3 AI image generator model, and suddenly there’s a real tussle going on between the big image makers.
It is, however, fair to say that the big G continues to play catch-up in the AI space, despite being one of the true pioneers of the artificial intelligence arena.
We’ve covered the Imagen 3 basics before, and the new version is a worthy successor. The results were pretty good, although not standout in terms of quality. However, it’s really disappointing to report that, despite some great results, the generator was let down by at least one unbelievable moderation flaw.
It refused to deliver an image featuring a party in a garden. No matter what we tried, it refused for reasons of... who knows? The nearest it could get was a pitifully low-resolution image of a cake on a dimly lit table. Really, Google?
Flux (Schnell)
Flux is the huge surprise on the block. Surprise because it arrived out of nowhere, because it’s open source and because it’s absolutely brilliant at producing AI images.
What is not so surprising is the fact that the development team comes out of the original Stable Diffusion crew, so there's oodles of legacy expertise at play.
The Flux model we used (via fluximagegenerator.net) was Schnell, which is one of the three on offer (the others being Dev and Pro).
As we said, the images were uniformly amazing, both in terms of the coherence of the image structure and the quality itself. The prompt adherence, the image resolution and the lack of dodgy fingers, faces and text all stand out, and declare a new image master has arrived.
Stable Diffusion (SDXL)
It may be getting old, but the Stable Diffusion model family is the gift that keeps on giving. It’s the most popular image generator by far (over 12 billion images created thus far), and just when you think it’s on its last legs, up pops a new LoRA or fine-tune which delights. We ran some tests with my much-loved Krita Diffusion AI installation, and the results were surprisingly solid.
To get the best out of the models, you’ll need to fiddle around with add-on LoRAs for things like faces and fingers, but once you find a combo that works, it really does hold its own against the newer tools. Of course, it’s pretty hit-and-miss in terms of text generation, so you’ll have to put up with a few glitches here and there. Which is where the fine-tuned models like Ideogram 2.0 come into play.
Winner: Flux (Schnell)
To say we users are spoiled for choice is an understatement. Not only has the image generation market exploded into public use, but the quality and price of the products have been continually improving as the technology matures. What’s even better is the fact that it’s not all proprietary tech out in front. The free open source products are not only holding their own but, in the case of Flux, leading the way. It’s a glorious time to be alive.
The two main surprises from this quick roundup are the continued strength of open source and older products like Stable Diffusion SDXL, and the embarrassingly bad showing by Google, yet again. To glitch this badly in this kind of market displays a deep problem in its AI development team. Pretty unfathomable, to be honest.
Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the technology industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour.
He has an Honours degree in law (LLB) and a Master's Degree in Business Administration (MBA), and his work has made him an expert in all things software, AI, security, privacy, mobile, and other tech innovations. Nigel currently lives in West London and enjoys spending time meditating and listening to music.