Audio Archives - ETCentric

Character.AI Introduces New Video Generator in Closed Beta

By Paula Parisi
April 24, 2025

Character.AI, a platform offering AI chatbots for socializing and role play, has released a video generation model called AvatarFX in closed beta. Promising the ability to make photorealistic images “come to life” and “speak, sing and emote” with “the click of a button,” the technology combines audio and video to create a variety of visual styles and voices, from realistic 3D, including “non-human faces (like a favorite pet),” to 2D animations, according to the company. AvatarFX also has the ability “to maintain strong temporal consistency with face, hand and body movement” and can “power videos with multiple speakers.”

Instagram ‘Edits’ Video App Is Released for iOS and Android

By Paula Parisi
April 24, 2025

Instagram has released a standalone video editing tool called Edits that is being described as a full-fledged suite that also has camera capabilities. The resulting content can be released on any social platform, not just those from Meta Platforms, though an Instagram account is required to access Edits. Available worldwide for iOS and Android, Edits is positioned as a way for social videographers to level up their Instagram or Facebook Reels, but also as a tool for professionals who want a simple mobile solution for short-form videos. Edits also offers analytics so creators can see how their work is performing.

Vertex AI Media Studio Can Create Videos from Start to Score

By Paula Parisi
April 14, 2025

The many tech advancements unveiled at Google Cloud Next include a major generative media upgrade to Vertex AI, Google Cloud’s managed AI development platform. The new Vertex AI Media Studio lets enterprise users generate complete videos from scratch using text prompts. Lyria, Google’s text-to-music model, is now available on Vertex in private preview. Both are subject to an “allowlist.” Chirp 3 now creates custom voices with just 10 seconds of audio input, while Imagen 3 has gained improved abilities for reconstructing missing or damaged portions of an image.

Netflix Expands Dubbing and Subtitle Options to 30 Languages

By Paula Parisi
April 4, 2025

Netflix has gone multilingual, adding a feature that lets viewers choose from a list of more than 30 languages for dubbing or subtitles on any title. The option was previously available only via mobile and Web browsers, with TV options limited to a handful of choices deemed relevant based on geographic location. Referencing some of its most popular programming, such as South Korea’s “Squid Game,” Spain’s “Berlin” and France’s “Lupin,” Netflix explains, “we know that language availability is what helped these stories and characters find fans beyond their country of origin.”

Patreon Signs Podcasting Deals with Wondery and Sony Music

By Paula Parisi
April 4, 2025

Patreon, a subscription platform popular among individual creators and small companies, is expanding beyond boutique service with a network initiative that has inked Wondery and Sony Music Entertainment to podcasting deals. Patreon says podcasting is its largest category, with participants earning more than $472 million from over 6.7 million paid memberships. The figure marked a 35 percent increase from 2023. With more than 100 million total memberships, Patreon says it is “the best place on the Internet for independent podcasters and media networks alike.” The 12-year-old company provides tools for creators to connect directly with fans.

Nvidia Forges AI Initiative to Streamline Production Workflows

By Douglas Chan
March 31, 2025

During Nvidia’s GTC AI Conference in San Jose earlier this month, VP and GM of Media & Entertainment Richard Kerris presented the Nvidia Media2 initiative, which builds on the company’s Blackwell GPU foundation to enable real-time AI solutions for all aspects of media production workflows. His talk showcased a broad range of generative AI breakthroughs in real-time ray tracing and VFX, video search and summarization, and musically based sound effects (SFX). Kerris also shared insights on the media industry’s reception of AI thus far and humbly implored the audience to consider using such technology as an effective new tool for storytelling.

Alibaba’s Powerful Multimodal Qwen Model Is Built for Mobile

By Paula Parisi
March 28, 2025

Alibaba Cloud has released Qwen2.5-Omni-7B, a new AI model the company claims is efficient enough to run on edge devices like mobile phones and laptops. Boasting a relatively light 7-billion-parameter footprint, Qwen2.5-Omni-7B understands text, images, audio and video and generates real-time responses in text and natural speech. Alibaba says its combination of compact size and multimodal capabilities is “unique,” offering “the perfect foundation for developing agile, cost-effective AI agents that deliver tangible value, especially intelligent voice applications.” One example would be using a phone’s camera to help a vision-impaired person navigate their environment.

Google Debuts Next-Gen Reasoning Models with Gemini 2.5

By Paula Parisi
March 27, 2025

Google has released what it calls its most intelligent AI model yet, Gemini 2.5. The first 2.5 model release, an experimental version of Gemini 2.5 Pro, is a next-gen reasoning model that Google says outperformed OpenAI o3-mini and Claude 3.7 Sonnet from Anthropic on common benchmarks “by meaningful margins.” Gemini 2.5 models “are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy,” according to Google. The new model comes just three months after Google released Gemini 2.0 with reasoning and agentic capabilities.

Google Launches Agentspace in the UK and Promotes Chirp 3

By Paula Parisi
March 19, 2025

Google is expanding its AI presence in the UK market, hosting a splashy launch event there for Agentspace. Google in December launched Agentspace, an AI agent hub that makes it easy for enterprises to build, manage and deploy custom agents using Gemini. The gathering was hosted by Google DeepMind CEO Demis Hassabis and Google Cloud CEO Thomas Kurian, and included participation by local customers BT Group and advertising powerhouse WPP. Google invited UK businesses to store cloud data locally using its $1 billion data center, which opens there this year. The company also promoted its new Chirp 3 audio generator, which offers HD voice synthesis.

Baidu Releases New LLMs that Undercut Competition’s Price

By Paula Parisi
March 18, 2025

Baidu has launched two new AI systems, the native multimodal foundation model Ernie 4.5 and deep-thinking reasoning model Ernie X1. The latter supports features like generative imaging, advanced search and webpage content comprehension. Baidu is touting Ernie X1 as comparable in performance to another Chinese model, DeepSeek-R1, but says it is half the price. Both Baidu models are available to the public, including individual users, through the Ernie website. Baidu, the dominant search engine in China, says its new models mark a milestone in both reasoning and multimodal AI, “offering advanced capabilities at a more accessible price point.”

$7.5M Funds NYU’s Sony Audio Institute, Opening This Spring

By Paula Parisi
March 13, 2025

Sony Corporation has launched the Sony Audio Institute at NYU’s Steinhardt School of Culture, Education and Human Development, focusing on innovation in the business and technology of music. Opening this spring, the Sony Audio Institute will serve as an interdisciplinary collaboration that brings together the expertise of Sony’s professional and consumer audio businesses and their leading-edge technologies with NYU students, facilities and faculty. The institute opens with NYU Steinhardt Music Business Program Director Larry Miller at the helm. Miller will focus on the new outfit’s operations full time beginning this fall.

Authors Can Use ElevenLabs Audiobook Narration for Spotify

By Paula Parisi
February 24, 2025

Spotify is boosting its audiobook content by agreeing to accept material narrated using ElevenLabs’ AI voice app. Given that ElevenLabs is currently among the most recognized AI audio providers, this new partnership is expected to boost the quantity of AI-narrated audiobooks on the platform. ElevenLabs content can be distributed to Spotify (and “select other audiobook retailers”) via Spotify’s Findaway Voices platform for indie authors. For $99 per month, authors can generate up to 500 minutes of narration from AI audio startup ElevenLabs in 29 languages with what Spotify says is “complete control over voice and intonation.”

Adobe Firefly Video Now in Public Beta Starting at $10 Month

By Paula Parisi
February 14, 2025

Adobe’s Firefly video is now in public beta as part of Firefly AI, now multi-modal with video, image and vector generation. Available for $10 per month for Firefly Standard or $30 per month for Firefly Pro, the Firefly app offers additional tiers for premium video and audio features, offering a degree of customization based on project needs. Adobe continues to position Firefly as “the only generative AI model that is IP-friendly and commercially safe,” offering the option of contractual IP indemnification to protect against infringement lawsuits “in the unlikely event of a claim involving a Firefly output.”

ByteDance’s AI Model Can Generate Video from Single Image

By Paula Parisi
February 6, 2025

ByteDance has developed a generative model that can use a single photo to generate photorealistic video of humans in motion. Called OmniHuman-1, the multimodal system supports various visual and audio styles and can generate people doing things like singing, dancing, speaking and moving in a natural fashion. ByteDance says its new technology clears hurdles that hinder existing human video generators, such as short play times and over-reliance on high-quality training data. The diffusion transformer-based OmniHuman addresses those challenges by mixing motion-related conditions into the training phase, a solution ByteDance researchers claim is new.

YouTube Premium Offers Speed Controls and Improved Audio

By Paula Parisi
January 27, 2025

YouTube is rolling out new experimental features for Premium users and letting those paid-plan subscribers access more than one test feature at a time. Among the exploratory features now available to YouTube Premium users are high-quality 256 kbps audio on music videos and the ability to “jump ahead” on the web, something previously available only on mobile devices. For iOS users, picture-in-picture and smart downloads for YouTube Shorts are also among the new features. In addition, the company announced bundled pricing for users who subscribe to both YouTube Premium and Google One Premium.