AI video just took a big leap forward — Pika Labs adds lip syncing
Lip Syncing brings voice to AI video
Pika Labs, one of the leading AI video platforms, has added a new feature that can bring voice to generated characters.
Lip Sync was built in partnership with AI audio platform ElevenLabs and lets you give words to people in generated videos and sync their lip movements to the sound.
Until now, filmmakers who wanted characters in a generated video to hold a conversation had to accept the absence of lip movement, or intercut real actors with generated clips.
Lip Sync changes that. The new tool is a significant moment in the generative AI video space, which itself is barely a year old. I'd argue that, once it is properly deployed and the initial issues are ironed out, it is as big a moment as the launch of OpenAI's Sora.
What is Lip Sync from Pika Labs?
Until now, most AI-generated video clips have been just that: clips showing a scene, a person or a situation. They haven't had the interactivity of a character speaking to the camera or to someone else on screen.
Without the ability to have realistic characters speaking to the audience, most videos have been glorified slideshows or have been used for music videos.
I've done both, and I've also made fictional trailers for TV shows and commercials, all using voice-over rather than giving specific characters a voice in the video.
I haven't tried Lip Sync myself yet, as it's currently only available to users subscribed to the Pro plan or above. From what I've seen of others' generations, it isn't perfect, but it is very close to being production-ready. At the very least, it will offer a cheap way to get a pilot off the ground quickly.
The feature can generate speech from text using a voice provided by ElevenLabs, or take a direct audio upload if you've already got your own sound, such as a podcast or audiobook.
Similar functionality is already available from tools like Synthesia, but that platform has more of an enterprise and customer service focus and generates talking heads rather than characters.
Why is Lip Sync in AI videos a big deal?
Runway and Pika Labs have been the dominant platforms for true generative video for the past few months. Both were early to market and have iterated quickly, with Runway revealing its synthetic voice-over service last year, although it was not synced to video.
Competition is starting to heat up, though, with all the big players exploring generative video and OpenAI revealing its very impressive Sora AI video platform.
Stability AI also has a new version of Stable Video Diffusion, and Leonardo is offering motion for any of its AI-generated images. Google has Lumiere and Meta has Emu, forcing the early players to add new features before everyone else catches up.
What comes next?
Up until now we've seen silos in generative AI: tools that make images, tools that create videos, services for writing a script and something else to add sound. The next step will be greater convergence, with platforms emerging that offer full end-to-end production from a simple text prompt.
ElevenLabs is also working on a sound effects library. Combine that with an AI music generator such as Suno, and we could soon see a single platform where you can say "take this script written by ChatGPT and turn it into a short film".
A few minutes later you'd have a timeline with a series of videos, parts spoken by characters using ElevenLabs' synthetic voices, and appropriate sound effects and music playing to bring the full production to life.
There was concern we'd see AI turn into Skynet and control our lives, but the evidence (so far) seems to suggest it just wants to entertain.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?