StabilityAI reveals Stable Diffusion 3 — it does for AI images what Sora is doing for video

AI generated image by Stable Diffusion 3
(Image credit: StabilityAI/AI generated)

Stable Diffusion 3, the next generation of the popular open-source AI image generation model, has been unveiled by StabilityAI, and it is an impressive leap forward.

Details of the new model were revealed alongside a series of images and prompts showing that it can follow complex instructions and create hyper-realistic images.

This early preview of the model is only available to a select group of testers while StabilityAI gathers feedback to improve performance and safety before a public release.

StabilityAI also used the Spawning "Do Not Train" registry to ensure that images from artists who did not want their work used to train AI were excluded. Over 1.5 billion images were filtered from the dataset before training.

What is Stable Diffusion 3?

Stable Diffusion 3 generated image

(Image credit: StabilityAI)

Unlike DALL-E, MidJourney or Google's Imagen, Stable Diffusion is an open model that can be integrated into other platforms or even run locally if you have enough compute power.

SD3 will include a suite of models ranging from 800 million to eight billion parameters, allowing for different levels of quality and for operation on a wide range of hardware devices.
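
To give a rough sense of why the parameter count matters for hardware, here is a back-of-the-envelope sketch of the memory needed just to hold each model's weights. It assumes two bytes per parameter (16-bit weights), which is an assumption made for illustration and not something StabilityAI has stated. The helper name `weight_memory_gb` is hypothetical, and real memory use during image generation would be higher.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory for model weights alone, in gigabytes.

    bytes_per_parameter=2 assumes 16-bit weights (an illustrative assumption).
    """
    return num_parameters * bytes_per_parameter / 1e9


# The two ends of the announced SD3 range.
smallest = weight_memory_gb(800_000_000)    # 800 million parameters
largest = weight_memory_gb(8_000_000_000)   # eight billion parameters

print(f"smallest model: about {smallest:.1f} GB of weights")
print(f"largest model: about {largest:.1f} GB of weights")
```

Under that assumption the smallest model's weights would fit comfortably on a modest consumer graphics card, while the largest would need a high-end one.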

Like OpenAI's Sora, Stable Diffusion 3 combines diffusion model technology with the transformer architecture, which could explain the improved instruction-following capabilities.

It also uses flow matching, a mathematical technique for training diffusion models. At different stages of the process between random noise and a real-world image, it measures the difference between what the model generates and what would lead to the real image.
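
For readers who want to see the idea in code, here is a minimal sketch of one common form of flow matching, using a straight-line path between noise and data. With the research paper unpublished, this is a generic textbook-style formulation and not necessarily what SD3 uses. The function names and the toy four-number "image" are hypothetical.

```python
import random


def interpolate(noise, data, t):
    """Point on the straight path from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]


def target_velocity(noise, data):
    """Direction of travel along the straight path (the same at every t)."""
    return [d - n for n, d in zip(noise, data)]


def flow_matching_loss(predict_velocity, data, rng):
    """Mean squared error between predicted and true velocity at a random stage."""
    noise = [rng.gauss(0.0, 1.0) for _ in data]   # start from random noise
    t = rng.random()                              # random stage in [0, 1)
    x_t = interpolate(noise, data, t)             # partly noisy sample
    predicted = predict_velocity(x_t, t)
    target = target_velocity(noise, data)
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(data)


# Toy usage: a four-number "image" and an untrained model that predicts zeros.
image = [0.2, 0.8, 0.5, 0.1]
untrained_model = lambda x_t, t: [0.0 for _ in x_t]
print(flow_matching_loss(untrained_model, image, random.Random(0)))
```

Training would repeat this over many images and adjust the model to push the loss down. To generate an image, the trained model's predicted velocity is then followed step by step, starting from pure noise.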

What can Stable Diffusion 3 do?

SD3 generated image

The prompt for this image was followed almost exactly. It was: “Photo of a red sphere on top of a blue cube. Behind them is a green triangle, on the right is a dog, on the left is a cat”. (Image credit: StabilityAI)

Few people outside of the development team have had direct access to Stable Diffusion 3 yet and the research paper has yet to be published, so what we know of its abilities is limited to what the team has said and the output it has shared.

From what I can see of the images so far, it is a significant step change in generative images. It, alongside OpenAI's Sora, is an indication of a major upgrade in the way generative AI works and how well it works.

It appears to create consistent, extended and legible text on images, solve the problems around human anatomy, including fingers, and capture color well.

Emad Mostaque, founder of StabilityAI, said the company has 100x fewer resources for training AI models than the likes of OpenAI but is still achieving impressive work. He suggested that, like Sora, SD3 will be able to accept a range of inputs including video and image.

Details of SD3 come a few days after StabilityAI also unveiled Stable Cascade, a new technique for generating images that Mostaque says will work with SD3 in future.

Ryan Morrison
AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?