Forget Sora — MiniMax is a new realistic AI video generator and it’s seriously impressive
I tried it on a series of prompts
MiniMax is the latest artificial intelligence video generator to come out of China. It is already making waves for its ability to generate hyper-realistic footage of humans, including accurate hand movements. This is something other tools have struggled with.
This is just the latest foray into generative AI for the Alibaba- and Tencent-backed unicorn startup. Its AI companion app Talkie has been downloaded over 15 million times and, like Character.ai, lets you converse with a virtual creation.
The official demo of the app shared on X appears to show the trailer for a magical adventure where a child touches a coin and is transported through history. It features special effects, a consistent character, and realism — all made from just text prompts, AI, and clever editing.
To find out if the actual tool lives up to the hype, I signed up for an account, came up with a handful of prompts and started putting it to the test. While it is impressive and up there with Runway Gen-3, Dream Machine and Kling, it isn’t as big of a leap as the video suggests.
What is MiniMax video?
Another Chinese 'Sora': A new AI video tool launched today by Minimax, backed by major investors Alibaba Group and Tencent. 🎞️ Check out their official AI film Magic Coin 🪙, created entirely with text-to-video. 🥁 Try it for free now: https://t.co/Kl1avPXkFL pic.twitter.com/df14ZVq1Es (August 31, 2024)
MiniMax video-01 is the latest in a line of models from the startup, which also cover speech, language and music generation. The company released the new video model without fanfare early in September and it quickly blew up on social media in China and in the West.
Founder Yan Junjie told reporters: “We have indeed made significant progress in video model generation, and based on internal evaluations and scores, our performance is better than that of Runway in generating videos.”
The company is already working on version-02 of its video model and plans to continue updating it to include image-to-video, text-and-image-to-video and longer initial clip generation.
The model supports 1280x720 resolution videos at 25 frames per second. Like Kling and Runway, it lets you describe cinematic camera movements. Clips are limited to six seconds for now, but the plan is to match the 10-second clips of current industry leaders with the next update.
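To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the resolution, frame rate and clip lengths quoted above. The function names are my own and are not part of any MiniMax tool.

```python
# Figures quoted in the article for MiniMax video-01.
WIDTH, HEIGHT = 1280, 720   # output resolution in pixels
FPS = 25                    # frames per second
CLIP_SECONDS = 6            # current clip length
TARGET_SECONDS = 10         # clip length the next update is meant to match


def frame_count(seconds: int, fps: int = FPS) -> int:
    """Number of frames the model has to generate for one clip."""
    return seconds * fps


def pixels_per_clip(seconds: int) -> int:
    """Total pixels across every frame of one clip (illustrative helper)."""
    return frame_count(seconds) * WIDTH * HEIGHT


if __name__ == "__main__":
    for seconds in (CLIP_SECONDS, TARGET_SECONDS):
        print(f"{seconds}s clip: {frame_count(seconds)} frames, "
              f"{pixels_per_clip(seconds):,} pixels")
```

In other words, a six-second clip is 150 frames, and moving to 10 seconds means generating 250 frames per clip while keeping the subject consistent across all of them.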
Putting MiniMax video-01 to the test
I pulled together a range of prompts covering different types of movement, text rendering and a combination of scenes, closeups and camera motion types. I’ve included all the prompts below if you want to try for yourself on the video-01 page.
1. Lightning Storm Over a Futuristic City
The prompt: "A nighttime scene of a towering, futuristic city skyline with sleek, glowing buildings. Sudden, bright flashes of lightning streak across the sky, illuminating the buildings and casting dramatic shadows. The rain begins to pour, and the scene ends with a close-up of raindrops hitting a neon-lit street."
2. A Butterfly Landing on a Water Lily
The prompt: "A peaceful, close-up scene of a calm pond with a blooming water lily in the centre. A delicate butterfly flutters in from the side, gently landing on the flower. The ripples in the water subtly move as the butterfly’s wings slowly close and open, creating a serene, tranquil atmosphere."
3. Spaceship Launch from an Alien Planet
The prompt: "A wide shot of a rugged, alien landscape under a sky filled with two moons. A sleek spaceship, with futuristic design elements, powers up in the foreground, its engines glowing. The ground trembles slightly as the ship lifts off, leaving behind a trail of dust and glowing embers, quickly ascending into the sky."
4. A pride of lions at dusk
The prompt: "A majestic scene in the African savannah at dusk. A pride of lions, with the dominant male in the foreground, is gathered near a watering hole. The lions’ golden fur glows in the fading sunlight as they drink and survey the landscape. The scene ends with the male lion lifting its head and gazing into the distance as the sky turns a deep orange."
5. Vintage Cinema Title Card
The prompt: "A sepia-toned, old-fashioned cinema title card with an ornate border and classic font. The text "Presenting: The Adventures of the Lost Treasure" appears in the centre, accompanied by a subtle film grain effect and the flickering of an old film reel. The title lingers for a moment before fading out, leaving just the grainy background."
6. Girl talking in a cafe
The prompt: "A cosy, retro-style diner with warm, ambient lighting, complete with red leather booths and a classic jukebox in the corner. In the foreground, a young woman in her mid-20s sits at a booth, casually chatting and smiling. She has shoulder-length brown hair, wearing a light blue sweater and jeans. She is animated, gesturing with her hands as she talks, conveying a sense of enthusiasm and engagement."
7. The northern lights
The prompt: "A breathtaking Arctic night where the sky comes alive with the dazzling display of the Northern Lights. Waves of green, purple, and blue auroras dance across the sky, shifting and swirling in a mesmerising rhythm. Snow-covered mountains stand majestically in the background, their peaks illuminated by the ethereal glow."
Final thoughts
MiniMax video-01 is a good model, roughly equivalent to Luma Labs Dream Machine but not as good as Runway Gen-3, despite what the CEO claims.
The other big Chinese video model available in the West is Kling, and its output is leaps and bounds ahead of the clips I generated with MiniMax. It also has a wider feature set, including 10-second clips, a longer-generating pro mode and image-to-video.
However, MiniMax does seem to handle human movement well, and the company promises this is just the first version, with a follow-up in weeks not months, so this is definitely an AI video generator to keep an eye on.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?