Midjourney has competition — I got access to Google Imagen 3 and it is impressive
Available in ImageFX
Imagen 3 is a text-to-image artificial intelligence model built by Google's advanced AI lab DeepMind. It was announced at Google I/O and is finally rolling out to users.
The model is currently only available through the Google AI Test Kitchen experiment ImageFX and only to a small group of “trusted users” but that pool is being expanded regularly.
With Imagen 3, Google promises better detail, richer lighting and fewer artifacts than previous generations. It also has better prompt understanding and text rendering.
ImageFX is available for any Google user in the U.S., Kenya, New Zealand and Australia. I’ve been given access to Imagen 3 and created a series of prompts to put it to the test.
Creating Imagen 3 prompts
Google DeepMind promises higher-quality images across a range of styles including photorealism, oil paintings and graphic art. It can also understand natural language prompts and complex camera angles.
I fed all this into Claude and had it come up with a bullet list of promised features. I then refined each bullet into a prompt to cover as many areas as possible. The feature I’m most excited about is its ability to accurately render text on an image, something few image models do well.
1. The wildcard (I’m feeling lucky)
The first prompt is one that ImageFX generated on its own. It automatically suggests an idea and you hit ‘tab’ to see the prompt in full and either adapt or use it to make an image.
This is also a good way to test one of the most powerful features of ImageFX — its chips. These turn keywords or phrases into menu items where you can quickly adapt elements of an image.
It offered me: “A macro photograph of a colorful tiny gnome riding a snail through a thick green forest, magical, fantasy.” Once an image is generated, you can edit any single element of it with inpainting. This generates four new versions but only changes the area you selected.
I love the way it rendered the background and captured the concept of a macro photograph. It was also incredibly easy to adapt the color of the hat.
2. Dewdrop Web Macro
This prompt aims to test Imagen 3's ability to render microscopic details and complex light interactions in a natural setting. This is similar to the first test but with an additional degree of complexity in the foreground from the dewdrop.
Prompt: "A macro photograph of a dewdrop on a spider's web, capturing the intricate details of the web and the refraction of light through the water droplet. The background should be a soft focus of a lush green forest."
As an arachnophobe, I was worried it would generate a spider but it followed the prompt well enough to just show a portion of the web.
3. Hummingbird Style Contrast
Here the aim is to test the model's versatility in generating contrasting artistic styles within a single image. I initially used the prompt: "Create a split-screen image: on the left, a photorealistic close-up of a hummingbird feeding from a flower; on the right, the same scene reimagined as a vibrant, stylized oil painting in the style of Van Gogh."
This would have worked with Midjourney or similar but not Google. I had to revise this prompt as ImageFX won’t create work in the style of a named artist, even one whose work is long out of copyright.
So I used: “Create a split-screen image: on the left, a photorealistic close-up of a hummingbird feeding from a flower; on the right, the same scene reimagined as a vibrant, stylized painting with bold, swirling brushstrokes, intense colors, and a sense of movement and emotion in every element. The sky should have a turbulent, dream-like quality with exaggerated stars or swirls.”
This is a style I plan to experiment with more as it looks stunning. I'd have adapted the background on one side to better match the 'Van Gogh' style but otherwise it was what I asked for and did a good job at integrating the conflicting styles.
4. Steampunk Market Scene
With this prompt, the aim is to challenge Imagen 3's ability to compose a complex, detailed scene with multiple elements and specific lighting conditions. I gave it some complex elements and descriptions to see how many it would produce.
Prompt: "A bustling steampunk-themed marketplace at dusk. In the foreground, a merchant is demonstrating a brass clockwork automaton to amazed onlookers. The background should feature airships docking at floating platforms, with warm lantern light illuminating the scene."
The first of the four images it generated exactly matched the prompt and the lighting is what you'd expect, suggesting Imagen has a good understanding of the real world.
5. Textured Reading Nook
Generating accurate or at least compelling textures can be a challenge for models, sometimes resulting in a plastic-style effect. Here we test the model's proficiency in accurately rendering a variety of textures and materials.
Prompt: "A cozy reading nook with a plush velvet armchair, a chunky knit blanket draped over it, and a weathered leather-bound book on the seat. Next to it, a rough-hewn wooden side table holds a delicate porcelain teacup with intricate floral patterns."
Not much to say about this beyond the fact it looks great. What I loved about this one were the options I was offered in 'chips'. It allowed me to easily swap cozy for spacious, airy and bright. I could even change the reading nook to a study, library and living room.
Obviously, you can just re-write the whole thing but these are ideas that work as subtle changes to fit the style.
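To make the chips idea concrete, here is a minimal sketch in Python of swapping one phrase in a prompt for an alternative. This is purely an illustration of the concept, not how ImageFX works under the hood: the `CHIPS` table and the `swap_chip` helper are hypothetical names I made up, and the alternatives are just the ones I was offered for this prompt.

```python
# Hypothetical model of "chips": each swappable phrase in the prompt maps to
# a list of alternatives. This is an illustration, not ImageFX's implementation.
PROMPT = "A cozy reading nook with a plush velvet armchair"

CHIPS = {
    "cozy": ["spacious", "airy", "bright"],
    "reading nook": ["study", "library", "living room"],
}


def swap_chip(prompt: str, original: str, replacement: str) -> str:
    """Replace the first occurrence of one chip phrase, leaving the rest as-is."""
    if original not in prompt:
        raise ValueError(f"{original!r} does not appear in the prompt")
    return prompt.replace(original, replacement, 1)


if __name__ == "__main__":
    # Swap the mood first, then the room, one chip at a time.
    step_one = swap_chip(PROMPT, "cozy", CHIPS["cozy"][0])
    step_two = swap_chip(step_one, "reading nook", CHIPS["reading nook"][1])
    print(step_one)
    print(step_two)
```

The point is that each chip only touches one phrase, so the rest of the prompt (and the overall style) stays intact. A real tool would also need to tidy up grammar, for example changing "A" to "An" before "airy".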
6. Underwater Eclipse Diorama
The idea behind this prompt was to test Imagen 3's ability to interpret and execute a long, detailed prompt with multiple complex elements and lighting effects.
Prompt: "An underwater scene of a vibrant coral reef during a solar eclipse. The foreground shows diverse marine life reacting to the dimming light, including bioluminescent creatures beginning to glow. In the background, the eclipsed sun is visible through the water's surface, creating eerie light rays that illuminate particles floating in the water."
This is probably the worst of the outputs. It looks fine but the solar eclipse feels out of place and the texture feels 'aquarium' rather than ocean.
7. Lunar Resort Poster
The last few tests all target the model's improved text rendering capabilities. This one asks for a poster and requires Imagen 3 to generate an image within a stylized graphic design context.
Prompt: "Design a vintage-style travel poster for a fictional lunar resort. The poster should feature retro-futuristic art deco styling with the text 'Visit Luna Luxe: Your Gateway to the Stars' prominently displayed. Include imagery of a gleaming moon base with Earth visible in the starry sky above."
Text rendering was as good as I've seen, especially across multiple elements rather than just the headline. The style was OK but not perfect. It fit the requirement but felt more art deco than futuristic.
8. Eco-Tech Product Launch
Midjourney is very good at creating product images in the real world. DALL-E is also doing that to a degree with its recent update. Here I'm asking Imagen 3 to create a modern, sleek advertisement with integrated product information.
Prompt: "Design a cutting-edge digital billboard for the launch of 'EcoCharge', a new eco-friendly wireless charging pad. The design should feature a minimalist, high-tech aesthetic with a forest green and silver color scheme. Include a 3D render of the slim, leaf-shaped device alongside the text 'EcoCharge: Power from Nature' and 'Charge your device, Save the planet - 50% more efficient'. Incorporate subtle leaf patterns and circuit board designs in the background."
It did exactly what I asked in the prompt in terms of the style, render and design. It got the title and subhead perfectly, and even rendered most of the rest of the text, although that wasn't nearly as clear.
9. Retro Gaming Festival
The final test is something I've actively used AI for — making a poster or flyer. Here we're testing its ability to handle a range of styles with multiple text elements.
Prompt: "Create a vibrant poster for 'Pixel Blast: Retro Gaming Festival'. The design should feature a collage of iconic 8-bit and 16-bit era video game characters and elements. The title 'PIXEL BLAST' should be in large, colorful pixel art font at the top. Include the text 'Retro Gaming Festival' in a chrome 80s style font below. At the bottom, add 'June 15-17 • City Arena • Tickets at pixelblast.com' in a simple, readable font. Incorporate scan lines and a CRT screen effect over the entire image."
I'd give it an 8 out of 10. Every single word was rendered correctly, but it occasionally added extra words, so there were sometimes just too many of them.
Final thoughts
Google Imagen 3's text rendering and realism have matched Midjourney levels. It does refuse to generate more often than Midjourney but that is understandable from a Google product.
Imagen 3 is a huge step up from Imagen 2, which was already a good model. The biggest upgrade seems to be in overall image quality. Generated pictures are better looking, with fewer artifacts and a lot more detail than I’ve seen from Imagen 2 or other companies' models.
It will be interesting to see how this works once it is rolled out to other platforms such as Gemini or built into third-party software as a developer API.
However it is finally deployed, DeepMind has done it again with an impressive real application of advanced generative AI and created it in a way that is user-friendly, adaptable and powerful enough for even the pickiest of users.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?