Language Translation Archives

Authors Can Use ElevenLabs Audiobook Narration for Spotify

By Paula Parisi
February 24, 2025

Spotify is boosting its audiobook content by agreeing to accept material narrated using ElevenLabs’ AI voice app. Given that ElevenLabs is currently among the most recognized AI audio providers, the new partnership is expected to increase the quantity of AI-narrated audiobooks on the platform. ElevenLabs content can be distributed to Spotify (and “select other audiobook retailers”) via Spotify’s Findaway Voices platform for indie authors. For $99 per month, authors can generate up to 500 minutes of ElevenLabs narration in 29 languages with what Spotify says is “complete control over voice and intonation.”

Adobe Firefly Video Now in Public Beta Starting at $10 Month

By Paula Parisi
February 14, 2025

Adobe’s Firefly video is now in public beta as part of Firefly AI, which is now multimodal with video, image and vector generation. Available for $10 per month for Firefly Standard or $30 per month for Firefly Pro, the Firefly app also has additional tiers for premium video and audio features, offering a degree of customization based on project needs. Adobe continues to position Firefly as “the only generative AI model that is IP-friendly and commercially safe,” offering the option of contractual IP indemnification to protect against infringement lawsuits “in the unlikely event of a claim involving a Firefly output.”

CES: Halliday’s AI Smart Glasses Project Directly into the Eye

By Paula Parisi
January 9, 2025

Wearable technology startup Halliday has unveiled smart glasses that beam images directly to the wearer’s eyes. At CES Unveiled, the Shenzhen-based company previewed AI-powered eyewear that projects images directly into the eyes instead of onto a lens and is controlled by a smart ring. The “minimal optical module projection technology,” dubbed DigiWindow, is being called first-of-its-kind. The device has a “proactive AI assistant” that reacts to its environment without being asked. The frames come in matte black or tortoiseshell and have lenses that can accommodate prescriptions.

YouTube Expands Access to Improved AI-Powered Dubbing

By Paula Parisi
December 12, 2024

Hundreds of thousands more YouTube channels are gaining access to the platform’s AI-powered auto-dubbing feature, which generates audio translation tracks for YouTube videos, helping to make its content more accessible to viewers around the world. The expanded rollout targets informational channels in the Partner Program, such as tutorials on cooking, sewing, tourism and home improvement. Availability “will expand to other types of content soon,” according to the video streamer, which began testing the feature with select creators last year. Based on technology developed by Aloud, YouTube’s auto-dubbing emerged from the Area 120 internal incubator program.

Bertelsmann and ElevenLabs Team Up to Foster AI Production

By Paula Parisi
December 3, 2024

German media company Bertelsmann has partnered with AI startup ElevenLabs on an effort to drive tech innovation and workflow across Bertelsmann production, marketing and distribution. Bertelsmann operations span roughly 50 countries with businesses including the publisher Penguin Random House, record label BMG and the RTL Group television unit. The objective is for ElevenLabs tools in voice and audio generation to help Bertelsmann expand productivity and reach. In August, New York-based ElevenLabs opened a European headquarters in London, expanding its international footprint for text-to-speech and other audio apps.

DeepL Voice Translates 33 Languages to Captions in Real Time

By Paula Parisi
November 15, 2024

DeepL, a German company that made its name with online text translation, has released DeepL Voice, a B2B tool that translates speech to captions in real time. DeepL Voice debuts in two iterations. DeepL Voice for Meetings allows participants to speak in their preferred language while serving colleagues with captions. DeepL Voice for Conversations works on mobile devices, facilitating in-person, one-on-one conversations “with customers, colleagues or anyone else, in the language that works best for them,” the company explains, noting that real-time voice translation offers specific challenges.

BodyTalk Dubs into 29 Languages with Facial Moves to Match

By Paula Parisi
November 12, 2024

Panjaya is an AI startup that aims to disrupt the world of video dubbing with a way to generate “hyperrealistic” recreations of a person’s voice speaking a new language. The system also automatically modifies the imagery so that lip and other physical movements match the new speech patterns. Called BodyTalk, the technique is the launch point for Panjaya as it emerges from the stealth in which it conducted its R&D over the past three years, backed by $9.5 million from venture funds and angel backers. The startup describes BodyTalk as “AI dubbing that looks and feels as natural as the original.”

ElevenLabs Reader App Is Available Globally in 32 Languages

By Paula Parisi
August 29, 2024

New York-based ElevenLabs is going global with its generative AI text-to-speech reader app, which can narrate writings in 32 languages with thousands of voices from which to choose. The audio startup promises “high quality, human-like” AI voices that are “emotionally and contextually aware,” adapting delivery of written cues “to achieve a high emotional range.” ElevenLabs has focused on “creative workflow,” with voice isolator and audio effects generator tools. Its catalog includes the voices of celebrities Judy Garland, Laurence Olivier, James Dean and Burt Reynolds. Custom models for translation and voiceover work using contemporary actors are a future possibility.

D-ID Employs AI to Translate Videos into Multiple Languages

By Paula Parisi
August 23, 2024

D-ID, a platform that uses AI to generate digital humans, has announced D-ID Video Translate in general availability. The tool lets businesses and content creators automatically re-voice videos in multiple languages, “cloning the speaker’s voice and adapting their lip movements from a single upload.” D-ID is making the Video Translate tool, which accommodates 30 different languages, free to D-ID subscribers for a limited time, available through the D-ID Studio or the company’s API. Languages include Arabic, Mandarin, Japanese, Hindi and Ukrainian, in addition to Spanish, German, French and Italian. Users can simultaneously translate content using bulk translation.

Q2 Report: Reddit Adds Users, Narrows Losses, Preps for AI

By Paula Parisi
August 12, 2024

Reddit will soon add AI-generated summaries atop its search results, co-founder and CEO Steve Huffman told investors on a Q2 earnings call last week. The company will later this year begin testing AI-powered search result pages that “summarize and recommend content,” said Huffman, who expects the technology will help Reddit users “dive deeper” into content and aid in discovery. Huffman mused on monetization strategies, including boosting ad inventory and possibly installing a paywall to differentiate premium content. The discussions accompanied news that Reddit lost $10.1 million in Q2, which saw revenues of $281 million, a 54 percent increase year-over-year.

Meta AI Seamless Translator Converts Nearly 100 Languages

By Paula Parisi
December 5, 2023

The research division of Meta AI has developed Seamless Communication, a suite of artificial intelligence models that generate what the company says is natural and authentic communication across languages, facilitating what amounts to real-time universal speech translation. The models were released with accompanying research papers and data. The flagship model, Seamless, merges capabilities from a trio of models (SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2) into a single system that can translate between almost 100 spoken and written languages, preserving idioms, emotion and the speaker’s vocal style, Meta says.

Spotify Uses AI to Copy Host Voices for Podcast Translations

By Paula Parisi
September 27, 2023

Spotify is using AI to drive podcast language translation in what sounds like the podcaster’s own voice, which has obvious implications for film and television dubbing. Working with podcast notables including Dax Shepard, Monica Padman and Bill Simmons, Spotify used AI to mimic their voices in Spanish, French and German for several episodes. The proprietary Spotify technology uses OpenAI’s new text-to-speech voice-generation technology as well as its open-source Whisper speech recognition system, which transcribes spoken words into text. The result, Spotify says, is “more authentic” and “more personal and natural” than traditional dubbing.

Google Shows Off Impressive Range of AI at NY Media Event

By Paula Parisi
November 4, 2022

Google Research is touting new advances in artificial intelligence, which can now generate its own code and write fiction, in addition to better text-to-video and language translation. At a New York media event at Google’s Pier 57 office, which opened earlier this year to become the company’s third Manhattan outpost, roughly a dozen projects in various stages of development were on display, with robot learning, LaMDA (language model for dialogue applications) and text-generated 3D images sharing the spotlight with practical AI for things like disaster management, weather forecasts and healthcare.

OpenAI Rolls Out Open-Source Speech Recognition System

By Paula Parisi
September 26, 2022

OpenAI has released a new open-source AI speech recognition model called Whisper that can recognize and translate audio at levels it says compare in accuracy and robustness to human abilities. Use cases include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.”

AI Is Still a Work in Progress When It Comes to Auto-Dubbing

By Paula Parisi
November 22, 2021

Auto-dubbing, which uses artificial intelligence to translate content into different languages, is a technology on which the global entertainment industry has increasingly come to rely in finding audiences among the planet’s 7.2 billion people, speaking more than 7,000 languages in roughly 200 countries. Companies like Flawless, Deepdub and Papercup use different approaches to offload to computers much of the labor required to fill that distribution pipeline. Another company, Spherex, emphasizes cultural awareness and the need for heightened sensitivity in pursuit of hits that travel across borders.