AI Voice Archives - ETCentric

Meta Is Increasing WhatsApp Ad Capabilities and AI Support

By Paula Parisi
July 3, 2025

Meta Platforms is expanding AI support for WhatsApp, adding voice calling to customers for enterprise accounts, a feature that’s been available to small businesses for years. “In the coming weeks, larger businesses using the WhatsApp Business Platform will be able to receive a call from a customer when they want to talk to someone live or call a customer directly” on request. Meta announced in Miami at its fourth annual Conversations business messaging conference, saying the move “paves the way for AI-enabled voice support in the future.” The company also added WhatsApp to centralized marketing across Facebook and Instagram. Continue reading Meta Is Increasing WhatsApp Ad Capabilities and AI Support

Authors Can Use ElevenLabs Audiobook Narration for Spotify

By Paula Parisi
February 24, 2025

Spotify is boosting its audiobook content by agreeing to accept material narrated using ElevenLabs’ AI voice app. Given that ElevenLabs is currently among the most recognized AI audio providers, this new partnership is expected to boost the quantity of AI-narrated audiobooks on the platform. ElevenLabs content can be distributed to Spotify (and “select other audiobook retailers”) via Spotify’s Findaway Voices platform for indie authors. For $99 per month, authors can generate up to 500 minutes of AI audio startup ElevenLabs’ narration in 29 languages with what Spotify says is “complete control over voice and intonation.” Continue reading Authors Can Use ElevenLabs Audiobook Narration for Spotify

MiniMax’s Hailuo AI Rolls Out New Image-to-Video Capability

By Paula Parisi
October 11, 2024

Hailuo, the free text-to-video generator released last month by the Alibaba-backed company MiniMax, has delivered its promised image-to-video feature. Founded by AI researcher Yan Junjie, the Shanghai-based MiniMax also has backing from Tencent. The model earned high marks for what has been called “ultra realistic” video, and MiniMax says the new image-to-video feature will improve output across the board as a result of “text-and-image joint instruction following,” which means Hailuo now “seamlessly integrates both text and image command inputs, enhancing your visuals while precisely adhering to your prompts.” Continue reading MiniMax’s Hailuo AI Rolls Out New Image-to-Video Capability

OpenAI Showcases Latest Updates for Voice, Picture and More

By Paula Parisi
October 3, 2024

OpenAI unveiled major updates at its DevDay conference with the focus largely on making AI more accessible, efficient and affordable. Included were four innovations: Vision Fine-Tuning in the API, Model Distillation, Prompt Caching and the public beta of Realtime API. The approach underscores OpenAI’s effort to empower its developer ecosystem even as it continues to compete for end-users in the enterprise space. The Realtime API gives developers the option of building “nearly real-time” speech-to-speech app experiences, selecting from among six OpenAI voices. Vision Fine-Tuning for GPT-4o enables customization of the model’s visual understanding of images and text. Continue reading OpenAI Showcases Latest Updates for Voice, Picture and More

Meta Reveals Orion Concept Glasses, Celeb Voices, Quest 3S

By Paula Parisi
September 26, 2024

Meta has secured rights to the voices of actors Judi Dench, Kristen Bell, John Cena and others for its Meta AI chatbot, a ChatGPT-like digital assistant that is part of the plan for conversational AI as part of the multimodal Llama 3.2. Also revealed at Meta Connect this week was Orion, “the most advanced glasses the world has ever seen,” queued up to become Meta’s “first consumer full holographic AR glasses,” though they won’t be available anytime soon. A low-priced Quest 3 mixed reality headset, the $299 Quest 3S, will be arriving in time for the holidays, however. Continue reading Meta Reveals Orion Concept Glasses, Celeb Voices, Quest 3S

OpenAI Rolls Out Advanced Voice Mode Feature for ChatGPT

By Paula Parisi
September 26, 2024

As OpenAI gears up to become a for-profit company next year, it is releasing ChatGPT Advanced Voice Mode, which brings a humanlike conversation mode to ChatGPT 4o. All U.S. subscribers to ChatGPT Plus and Team plans will gain access to the new feature, which will also be made available to those paying for ChatGPT Edu and Enterprise plans in the coming weeks. The firm is also adding five new voices and allowing customers to save personalized instructions for the voice assistant, including memory behaviors. Concurrently, executives including CTO Mira Murati have resigned as the company pivots to commerciality. Continue reading OpenAI Rolls Out Advanced Voice Mode Feature for ChatGPT

Google Begins Rolling Out Gemini Live Free to Android Users

By Paula Parisi
September 18, 2024

Google announced the company is making its new AI assistant Gemini Live available free to all Android users. The move follows the feature’s release last month to Gemini Advanced subscribers. This general release will occur gradually, and only in English for the time being. Gemini Live lets users have a more natural, free-flowing conversation with their phones than was available through Google Assistant via the “Hey, Google” prompt. Gemini inquiries are meant to be conversational, eliciting a back and forth that queriers can interrupt, adding more detail or veering to another topic entirely. Continue reading Google Begins Rolling Out Gemini Live Free to Android Users

Gemini Powering Google Vids Multimedia Presentation Builder

By Paula Parisi
July 18, 2024

Google has launched the beta version of its Gemini-powered Google Vids productivity app, which lets users create work-related video presentations that embed documents, slides, audio recordings and even additional videos into a timeline. Incorporated into Workspace Labs, Google’s AI preview space, Google says invited participants can use Vids to “build a narrative with high quality templates” or “get to a first draft faster.” Access to Google’s royalty-free stock content library and Vids recording studio means a project can be completed “without ever leaving Workspace,” according to the company. Continue reading Gemini Powering Google Vids Multimedia Presentation Builder

ElevenLabs Launches an AI Tool for Generating Sound Effects

By Paula Parisi
June 6, 2024

ElevenLabs has launched its text-to-sound generator Sound Effects for all users, available now at the company’s website. The new AI tool can create audio effects, short instrumental tracks, soundscapes and even character voices. Sound Effects “has been designed to help creators — including film and television studios, video game developers, and social media content creators — generate rich and immersive soundscapes quickly, affordably and at scale,” according to the startup, which developed the tool in partnership with Shutterstock, using its library of licensed audio tracks. Continue reading ElevenLabs Launches an AI Tool for Generating Sound Effects

Deepgram’s Speech Portfolio Now Includes Human-Like Aura

By ETCentric Staff
March 21, 2024

Deepgram’s new Aura software turns text into generative audio with a “human-like voice.” The 9-year-old voice recognition company has raised nearly $86 million to date on the strength of its Voice AI platform. Aura is an extremely low-latency text-to-speech voice AI that can be used for voice AI agents, the company says. Paired with Deepgram’s Nova-2 speech-to-text API, developers can use it to “easily (and quickly) exchange real-time information between humans and LLMs to build responsive, high-throughput AI agents and conversational AI applications,” according to Deepgram. Continue reading Deepgram’s Speech Portfolio Now Includes Human-Like Aura