By Paula Parisi, December 18, 2024
Meta has rolled out a firmware update that adds new features to Ray-Ban Metas in time for the holidays, making the smart glasses “the gift that keeps on giving,” per Meta marketing. “Live AI” adds computer vision, letting Meta AI see and record what you see “and converse with you more naturally than ever before.” Along with Live AI, Live Translation is available for Meta Early Access members. Speech in Spanish, French or Italian is translated into English (or vice versa) in real time and played as audio through the glasses’ open-ear speakers. In addition, Shazam support has been added for users who want to easily identify songs. Continue reading Ray-Ban Meta Gets Live AI, RT Language Translation, Shazam
By Paula Parisi, December 12, 2024
Hundreds of thousands more YouTube channels are gaining access to its AI-powered auto-dubbing feature, which generates audio translation tracks for YouTube videos, helping to make the platform’s content more accessible to viewers around the world. The expanded rollout targets informational channels in the Partner Program, such as tutorials on cooking, sewing, tourism and home improvement. Availability “will expand to other types of content soon,” according to the video streamer, which began testing the feature with select creators last year. Based on technology developed by Aloud, YouTube’s auto-dubbing emerged from the Area 120 internal incubator program. Continue reading YouTube Expands Access to Improved AI-Powered Dubbing
By Paula Parisi, November 12, 2024
Panjaya is an AI startup that aims to disrupt the world of video dubbing with a way to generate “hyperrealistic” recreations of a person’s voice speaking a new language. The system also automatically modifies the imagery so that lip and other physical movements match the new speech patterns. Called BodyTalk, the technique is the launch point for Panjaya as it emerges from the stealth mode in which it conducted its R&D over the past three years, backed by $9.5 million from venture funds and angel backers. The startup describes BodyTalk as “AI dubbing that looks and feels as natural as the original.” Continue reading BodyTalk Dubs into 29 Languages with Facial Moves to Match
By Paula Parisi, June 27, 2024
Synthesia, which uses AI to create business avatars for use in content such as training, presentation and customer service videos, has announced a major platform update. “Coming soon” with Synthesia 2.0 are full-body avatars that include hands capable of a wide range of motions. Users can animate motion using skeletal sequences, onto which the persona selected from the catalog can then be automatically mapped. Starting next month, the Nvidia-backed UK company will offer the ability to incorporate brand identity, including typography, colors and logos, into templated videos. A new translation tool automatically applies updates to all languages. Continue reading Lifelike AI Avatars to Get New Features with Synthesia Update
By ETCentric Staff, April 2, 2024
OpenAI has debuted a new text-to-voice generation platform called Voice Engine, available in limited access. Voice Engine can generate a synthetic voice from a 15-second clip of someone’s voice. The synthetic voice can then read a provided text, even translating to other languages. For now, only a handful of companies are using the tech under a strict usage policy as OpenAI grapples with the potential for misuse. “These small scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,” OpenAI explained. Continue reading OpenAI Voice Cloning Tool Needs Only a 15-Second Sample
By Paula Parisi, October 17, 2023
Captions, which leverages AI to help its customers produce “studio quality videos directly from their mobile devices,” has launched a new app called Lipdub that automatically translates and dubs content into 28 languages. The free download lets users dub anyone “and experience familiar voices and faces in a suite of new languages.” Lipdub’s translations not only duplicate what the company says is “the subject’s exact voice,” but also sync lip movements to match. The app also incorporates dialects and idioms, with options like Gen Z and Texas slang. Continue reading Captions Debuts AI Lipdub with Translation and Gen Z Slang
By Paula Parisi, August 24, 2023
Meta Platforms is releasing SeamlessM4T, the world’s “first all-in-one multilingual multimodal AI translation and transcription model,” according to the company. SeamlessM4T can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translations for up to 100 languages, depending on the task. “Our single model provides on-demand translations that enable people who speak different languages to communicate more effectively,” Meta claims, adding that SeamlessM4T “implicitly recognizes the source languages without the need for a separate language identification model.” Continue reading Meta’s Multimodal AI Model Translates Nearly 100 Languages
By Paula Parisi, July 14, 2023
As a next step in its advances in ethical AI, Adobe has announced its Firefly generative AI platform now supports text prompts in more than 100 international languages. The company says its Firefly AI app has generated over one billion images in Firefly and Photoshop since implementation in March. Adobe has also deployed artificial intelligence in Express, Illustrator and the Creative Cloud. Positioning its latest news as an expansion of global proportions, Adobe says its generative AI products will now support text prompts in native dialects in the standalone Firefly web service, with localization coming to more than 20 additional languages. Continue reading Adobe Pursues Ethical, Responsible AI in the Creative Space
By Paula Parisi, March 1, 2023
Following several months of tests, YouTube is launching its multi-language audio track feature worldwide, with popular vlogger MrBeast helping to promote the new feature’s benefits. MrBeast, who has over 135 million global subscribers, is hoping to attract new subscribers to his channel now that the most popular videos are dubbed into 11 different languages. The multi-language audio feature allows creators to dub new and existing videos. YouTube says more than 3,500 multi-language videos have been uploaded to the site in 40-plus languages since January of this year. Continue reading YouTube Introduces Multi-Language Audio Tracks Worldwide
By Paula Parisi, February 17, 2023
IT pros are grappling with the ways ChatGPT can be worked into the enterprise stack. The generative artificial intelligence from OpenAI has demonstrated the ability to compile reports, craft marketing pitches and write software code, which makes it seem convenient for business use. Yet concerns remain, including potential security risks and sometimes erratic or inappropriate data feedback. In the past week, one third-party tester had ChatGPT pledge love for its interlocutor, while another received a detailed lecture on why cow eggs are bigger than chicken eggs. Continue reading Business World Asks if Generative AI is Ready for Enterprise
By Paula Parisi, November 22, 2021
Auto-dubbing, which uses artificial intelligence to translate content into different languages, is a technology on which the global entertainment industry has increasingly come to rely in finding audiences among the planet’s 7.2 billion people, speaking more than 7,000 languages in roughly 200 countries. Companies like Flawless, Deepdub and Papercup use different approaches to offload to computers much of the labor required to fill that distribution pipeline. Another company, Spherex, emphasizes cultural awareness and the need for heightened sensitivity in pursuit of hits that travel across borders. Continue reading AI Is Still a Work in Progress When It Comes to Auto-Dubbing
By Paula Parisi, October 18, 2021
Artificial intelligence and machine learning are poised to revolutionize the dubbing process for media content, optimizing it for a more natural effect as part of an emerging movement called “auto-dubbing.” AI has impacted the way U.S. audiences are experiencing the Netflix breakout “Squid Game” and other foreign content, as well as helping U.S. programming play better abroad. Its impact is in its nascency. Soon, replacing rubber-lip syndrome with AI-enhanced visuals that enable language translation at the click of a button may become the industry norm. Continue reading AI-Powered Auto-Dubbing May Soon Become Industry Norm
By Paula Parisi, October 14, 2021
Microsoft and Nvidia have trained what they describe as the most powerful AI-driven language model to date, the Megatron-Turing Natural Language Generation model (MT-NLG), which has “set the new standard for large-scale language models in both model scale and quality,” the firms say. As the successor to the companies’ Turing NLG 17B and Megatron-LM, the new MT-NLG has 530 billion parameters, or “3x the number of parameters compared to the existing largest model of this type” and demonstrates unmatched accuracy in a broad set of natural language tasks. Continue reading Microsoft and Nvidia Debut World’s Largest Language Model
By Debra Kaufman, January 16, 2020
Several translation gadgets made a showing at CES 2020, among them the Ambassador, released last November from Brooklyn-based Waverly Labs, an over-the-ear gadget aimed at travelers. Pocketalk is a translation device that’s popular in Japan and will soon arrive in the U.S. TranslateLive’s ILA Pro adds a subscription-based service for real-time translation. Langogo Minutes is a device that records up to seven hours of audio and provides written transcripts of what it hears. And the WT2 Plus from Timekettle is a multi-language translator in the form of earbuds. Continue reading Variety of Real-Time Translation Devices Showcased at CES
By Debra Kaufman, September 13, 2018
Facebook’s Rosetta is a machine learning system that extracts text in many languages from over one billion images in real time. Facebook built its own optical character recognition system that can process such a huge amount of content, day in and day out. In a recent blog post, Facebook explained how Rosetta works, using a convolutional neural network to recognize and transcribe text, even non-Latin alphabets and non-English words. The system was trained with a mix of human- and machine-annotated public images. Continue reading Facebook Adds 24 Languages to Rosetta Translation Feature