Pika Taps ElevenLabs Audio App to Add Lip Sync to AI Video

On the heels of ElevenLabs’ demo of a text-to-sound app, which was unveiled using clips generated by OpenAI’s text-to-video artificial intelligence platform Sora, Pika Labs is releasing a feature called Lip Sync. It lets paid subscribers use the ElevenLabs app to add AI-generated voices and dialogue to Pika-generated videos, with the characters’ lips moving in sync with the speech. Pika Lip Sync supports both uploaded audio files and text-to-audio AI, allowing users to type or record dialogue, or use pre-existing sound files, then apply AI to change the voicing style.

Pika shared news of the new feature this week in a post on X, touting early access available now at Pika.art for Pro users who are paying $58 per month or $696 per year for the premium offering. The feature is also available to “members of Pika’s ‘Super Collaborators’ invitation-only program available through its Discord group,” VentureBeat reports.

“The feature can take text-to-audio with the voice provided by ElevenLabs, or a direct audio upload if you’ve already got your own sound — such as a podcast or book,” writes Tom’s Guide.

“While Pika’s AI-generated videos remain arguably lower quality and less ‘realistic’ than the ones shown off by OpenAI’s Sora or even another rival AI video generation startup, Runway, the addition of the new Lip Sync feature puts it ahead of both in offering capabilities disruptive to traditional filmmaking software,” VentureBeat opines, noting “most other leading AI video generators don’t yet currently offer a similar feature natively.”

The third-party tool Synthesia offers sync functionality, “but that has a more enterprise customer service focus and generates talking heads rather than characters,” according to Tom’s Guide.

VentureBeat suggests that such add-on tools have typically resulted in a video with “a ‘low budget,’ Monty Python-esque quality.”

Pika, which built Lip Sync “in conjunction with AI audio platform ElevenLabs,” writes PetaPixel, now “allows creators to give words to people in AI videos and sync their lip movements to the desired speech.” PetaPixel calls the result a “leap forward in AI video as the technology increasingly matures,” adding that it “now seems inevitable that AI video will at some point reach a similar fidelity to AI images.”
