By Paula Parisi, November 15, 2024
DeepL, a German company that made its name in online text translation, has released DeepL Voice, a B2B tool that translates speech to captions in real time. DeepL Voice debuts in two versions: DeepL Voice for Meetings, which allows participants to speak in their preferred language while serving colleagues with captions, and DeepL Voice for Conversations, which works on mobile devices, facilitating in-person, one-on-one conversations “with customers, colleagues or anyone else, in the language that works best for them,” the company explains, noting that real-time voice translation poses specific challenges. Continue reading DeepL Voice Translates 33 Languages to Captions in Real Time
By Paula Parisi, November 12, 2024
Panjaya is an AI startup that aims to disrupt the world of video dubbing with a way to generate “hyperrealistic” recreations of a person’s voice speaking a new language. The system also automatically modifies the imagery so that lip and other physical movements match the new speech patterns. Called BodyTalk, the technique is the launch point for Panjaya as it emerges from the stealth in which it conducted its R&D over the past three years, backed by $9.5 million from venture funds and angel backers. The startup describes BodyTalk as “AI dubbing that looks and feels as natural as the original.” Continue reading BodyTalk Dubs into 29 Languages with Facial Moves to Match
By Paula Parisi, August 29, 2024
New York-based ElevenLabs is going global with its generative AI text-to-speech reader app, which can narrate writings in 32 languages with thousands of voices from which to choose. The audio startup promises “high quality, human-like” AI voices that are “emotionally and contextually aware,” adapting delivery of written cues “to achieve a high emotional range.” ElevenLabs has focused on “creative workflow,” with voice isolator and audio effects generator tools. Its catalog includes the voices of celebrities Judy Garland, Laurence Olivier, James Dean and Burt Reynolds. Custom models for translation and voiceover work using contemporary actors are a future possibility. Continue reading ElevenLabs Reader App Is Available Globally in 32 Languages
By Paula Parisi, August 23, 2024
D-ID, a platform that uses AI to generate digital humans, has announced the general availability of D-ID Video Translate. The tool lets businesses and content creators automatically re-voice videos in multiple languages, “cloning the speaker’s voice and adapting their lip movements from a single upload.” D-ID is making the Video Translate tool, which accommodates 30 different languages, free to D-ID subscribers for a limited time, available through the D-ID Studio or the company’s API. Languages include Arabic, Mandarin, Japanese, Hindi and Ukrainian, in addition to Spanish, German, French and Italian. Users can simultaneously translate content using bulk translation. Continue reading D-ID Employs AI to Translate Videos into Multiple Languages
By Paula Parisi, August 12, 2024
Reddit will soon add AI-generated summaries atop its search results, co-founder and CEO Steve Huffman told investors on a Q2 earnings call last week. Later this year, the company will begin testing AI-powered search result pages that “summarize and recommend content,” said Huffman, who expects the technology will help Reddit users “dive deeper” into content and aid in discovery. Huffman mused on monetization strategies, including boosting ad inventory and possibly installing a paywall to differentiate premium content. The discussions accompanied news that Reddit lost $10.1 million in Q2, which saw revenues of $281 million, a 54 percent increase year-over-year. Continue reading Q2 Report: Reddit Adds Users, Narrows Losses, Preps for AI
By Paula Parisi, December 5, 2023
The research division of Meta AI has developed Seamless Communication, a suite of artificial intelligence models that generate what the company says is natural and authentic communication across languages, facilitating what amounts to real-time universal speech translation. The models were released with accompanying research papers and data. The flagship model, Seamless, merges capabilities from a trio of models — SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 — into a single system that can translate between almost 100 spoken and written languages, preserving idioms, emotion and the speaker’s vocal style, Meta says. Continue reading Meta AI Seamless Translator Converts Nearly 100 Languages
By Paula Parisi, September 27, 2023
Spotify is using AI to drive podcast language translation in what sounds like the podcaster’s own voice, which has obvious implications for film and television dubbing. Working with podcast notables including Dax Shepard, Monica Padman and Bill Simmons, Spotify used AI to mimic their voices in Spanish, French and German for several episodes. The proprietary Spotify technology uses OpenAI’s new text-to-speech voice-generation technology as well as its open-source Whisper speech recognition system, which transcribes spoken words into text. The result, Spotify says, is “more authentic” and “more personal and natural” than traditional dubbing. Continue reading Spotify Uses AI to Copy Host Voices for Podcast Translations
By Paula Parisi, November 4, 2022
Google Research is touting new advances in artificial intelligence, which can now generate its own code and write fiction, in addition to better text-to-video and language translation. At a New York media event at Google’s Pier 57 office — which opened earlier this year to become the company’s third Manhattan outpost — roughly a dozen projects in various stages of development were on display, with robot learning, LaMDA (language model for dialogue applications) and text-generated 3D images sharing the spotlight with practical AI for things like disaster management, weather forecasts and healthcare. Continue reading Google Shows Off Impressive Range of AI at NY Media Event
By Paula Parisi, September 26, 2022
OpenAI has released a new open-source AI speech recognition model called Whisper that can recognize and translate audio at levels it says compare in accuracy and robustness to human abilities. Use cases include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.” Continue reading OpenAI Rolls Out Open-Source Speech Recognition System
By Paula Parisi, November 22, 2021
Auto-dubbing, which uses artificial intelligence to translate content into different languages, is a technology on which the global entertainment industry has increasingly come to rely in finding audiences among the planet’s 7.2 billion people, speaking more than 7,000 languages in roughly 200 countries. Companies like Flawless, Deepdub and Papercup use different approaches to offload to computers much of the labor required to fill that distribution pipeline. Another company, Spherex, emphasizes cultural awareness and the need for heightened sensitivity in pursuit of hits that travel across borders. Continue reading AI Is Still a Work in Progress When It Comes to Auto-Dubbing
By Debra Kaufman, November 5, 2018
At the IEEE Conference on Visual Analytics Science and Technology in Berlin, IBM and Harvard University researchers presented Seq2Seq-Vis, a tool to debug machine translation tools. Translation tools rely on neural networks, whose opacity makes it difficult to determine how mistakes were made, an issue known as the “black box problem.” Seq2Seq-Vis allows deep-learning app creators to visualize AI’s decision-making process as it translates a sequence of words from one language to another. Continue reading IBM, Harvard University Develop New Tool for AI Translation
LinkedIn is introducing two new features: the ability to use QR codes for quickly sharing profiles and contact details, and a “See Translation” button that will translate posts into different languages. Currently available for iOS and Android, the QR codes offer users a quick option for accessing someone’s profile or sharing their own code via messaging apps, email, websites or printed materials such as business cards, conference badges and company brochures. The translation tool, available for more than 60 languages, is offered through LinkedIn’s desktop and mobile web versions (and soon via iOS and Android). Continue reading LinkedIn Unveils Language Translation Tool and QR Codes