By ETCentric Staff, April 24, 2024
Virtual reality firm Pimax has unveiled two new headsets. The Crystal Super is a high-resolution performance model which starts at $1,800, while the Crystal Light will carry a base list price of $700. The Crystal Super packs 29.5 million pixels and allows users to swap between QLED and micro-OLED panels, which Pimax claims is a first. The Crystal Light offers the same 16.6 million pixels as its Crystal predecessor, but at a more affordable price. At its annual Frontier virtual event, Pimax also shared the specs for its 60G Airlink module, designed for high-fidelity wireless PCVR using WiGig technology. Continue reading Pimax Intros VR Headset with Switchable QLED, OLED Panels
By ETCentric Staff, April 23, 2024
Sony’s new line of Bravia televisions focuses on MiniLED display tech with the high-end Bravia 9. There is also the OLED-based Bravia 8, and the company is keeping 2023’s A95L QD-OLED in the mix. But the spotlight is on the LED backlighting system that Sony has spent several years refining, XR Backlight Master Drive, which can exert precise control over each pixel. Sony says the technology is comparable to the underpinnings of its professional mastering monitors. The XR Backlight Master Drive system allocates LED resources using purpose-built silicon created by Sony for its MiniLED TVs. Continue reading Sony Rolls Out Brighter, Better-Sounding Bravia TVs for 2024
By ETCentric Staff, April 22, 2024
Microsoft has developed VASA, a framework for generating lifelike virtual characters with vocal capabilities including speaking and singing. The premiere model, VASA-1, can perform the feat in real time from a single static image and a vocalization clip. The research demo showcases realistic audio-enhanced faces that can be fine-tuned to look in different directions or change expression in video clips of up to one minute at 512 x 512 pixels and up to 40fps “with negligible starting latency,” according to Microsoft, which says “it paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.” Continue reading Microsoft’s VASA-1 Can Generate Talking Faces in Real Time
By ETCentric Staff, April 19, 2024
Internet advertising revenues hit a record $225 billion in the U.S. in 2023, a 7.3 percent increase, according to a PwC report for the Interactive Advertising Bureau (IAB). The connected TV and audio categories saw double-digit growth, as did spending on e-commerce platforms, classified as “retail media,” which rose 16.3 percent year-over-year, reaching $43.7 billion in 2023 as key players expanded their ad inventory. Video advertising revenue climbed 10.6 percent year-over-year, to $52.1 billion, with 42 percent of that revenue generated from CTV and OTT streaming. Continue reading IAB: U.S. Digital Advertising Hit Record $225 Billion Last Year
By ETCentric Staff, April 18, 2024
Airchat is the latest app to take tech leaders in Silicon Valley by storm. Described as a “combination of voice notes and Twitter,” Airchat lets you follow other users and scroll through posts — adding replies, likes and shares — but the twist is the content is generated through audio recordings the app then transcribes. Airchat ranked 27th on the App Store’s social networking chart, even though users must be invited to join. Launched last year by Naval Ravikant, founder of AngelList, and erstwhile Tinder product exec Brian Norgard, Airchat was just relaunched on iOS and Android. Continue reading Audio-First Social Platform Airchat Has Successful Relaunch
By ETCentric Staff, April 11, 2024
Google is moving its most powerful artificial intelligence model, Gemini 1.5 Pro, into public preview for developers and Google Cloud customers. Gemini 1.5 Pro includes what Google claims is a breakthrough in long context understanding, with the ability to run 1 million tokens of information, “opening up new possibilities for enterprises to create, discover and build using AI.” Gemini’s multimodal capabilities allow it to process audio, video, text, code and more, which, when combined with long context, “enables enterprises to do things that just weren’t possible with AI before,” according to Google. Continue reading Google Offers Public Preview of Gemini Pro for Cloud Clients
By ETCentric Staff, March 21, 2024
Deepgram’s new Aura software turns text into generative audio with a “human-like voice.” The 9-year-old voice recognition company has raised nearly $86 million to date on the strength of its Voice AI platform. Aura is an extremely low-latency text-to-speech voice AI that can be used for voice AI agents, the company says. Paired with Deepgram’s Nova-2 speech-to-text API, developers can use it to “easily (and quickly) exchange real-time information between humans and LLMs to build responsive, high-throughput AI agents and conversational AI applications,” according to Deepgram. Continue reading Deepgram’s Speech Portfolio Now Includes Human-Like Aura
By ETCentric Staff, March 11, 2024
Alibaba is touting a new artificial intelligence system that can animate portraits, making people sing and talk in realistic fashion. Researchers at the Alibaba Group’s Institute for Intelligent Computing developed the generative video framework, calling it EMO, short for Emote Portrait Alive. Input a single reference image along with “vocal audio,” as in talking or singing, and “our method can generate vocal avatar videos with expressive facial expressions and various head poses,” the researchers say, adding that EMO can generate videos of any duration, “depending on the length of video input.” Continue reading Alibaba’s EMO Can Generate Performance Video from Images
By ETCentric Staff, March 1, 2024
On the heels of ElevenLabs’ demo of a text-to-sound app, which was unveiled using clips generated by OpenAI’s text-to-video artificial intelligence platform Sora, Pika Labs is releasing a feature called Lip Sync. It lets paid subscribers use the ElevenLabs app to add AI-generated voices and dialogue to Pika-generated videos, with the characters’ lips moving in sync with the speech. Pika Lip Sync supports both uploaded audio files and text-to-audio AI, allowing users to type or record dialogue, or use pre-existing sound files, then apply AI to change the voicing style. Continue reading Pika Taps ElevenLabs Audio App to Add Lip Sync to AI Video
By ETCentric Staff, March 1, 2024
Project Music GenAI Control, an experimental work from Adobe Research, is setting out to change how people create and edit custom audio and music. The prototype tool lets creators generate music from text prompts, “and then have fine-grained control to edit that audio for their precise needs,” according to Adobe. Designed to help create music for broadcasts, podcasts or other “audio that’s just the right mood, tone, and length,” it can generate music from text prompts like “powerful rock,” “happy dance” or “sad jazz,” says Adobe Research Senior Research Scientist Nicholas Bryan, a creator of the technology. Continue reading Adobe’s Prototype AI Tool Is a ‘Photoshop for Music-Making’
By ETCentric Staff, February 29, 2024
Spotify is rolling out AUX, an in-house music advisory agency for brands. “With AUX, we’ll use our deep expertise to counsel brands about how best to use music to enrich their campaigns and connect them with emerging artists to help them reach new audiences,” Spotify announced, joining Meta Platforms, YouTube, Snapchat and others in connecting creatives with brands. AUX aims to provide emerging artists with an avenue to another potential income source, as well as a path to wider exposure, as the idea is to get brands to pay Spotify to access the new service. Continue reading Spotify In-House Agency AUX to Connect Brands with Music
By ETCentric Staff, February 28, 2024
The Walt Disney Company has selected five companies to be in its annual Accelerator program, three of them AI startups, one in robotics and one developing VR. The program, now in its tenth year, identifies promising new tech companies to benefit from Disney funding and mentorship in exchange for an inside track on talent and acquisitions. The class of 2024 includes AudioShake, which leverages AI to aid in mixing and dubbing audio tracks; ElevenLabs, which has a text-to-speech app for GenAI voicing; and Promethean AI, a digital archives search platform that informs prototype design. Continue reading Latest Disney Accelerator Backs AI, VR, Autonomous Vehicles
By ETCentric Staff, February 22, 2024
“What if you could describe a sound and generate it with AI?” asks startup ElevenLabs, which set out to do just that, and says it has succeeded. The two-year-old company explains it “used text prompts like ‘waves crashing,’ ‘metal clanging,’ ‘birds chirping,’ and ‘racing car engine’ to generate audio.” Best known for using machine learning to clone voices, the AI firm founded by Google and Palantir alums has yet to make its new text-to-sound model publicly available but began teasing it by releasing online demos this week. Some see the technology as a natural complement to the latest wave of image generators. Continue reading ElevenLabs Promotes Its Latest Advances in AI Audio Effects
By ETCentric Staff, February 21, 2024
Researchers at Amazon have trained what they are calling the largest text-to-speech model ever created, which they claim is exhibiting “emergent” qualities: the ability to inherently improve itself at speaking complex sentences naturally. Called BASE TTS, for Big Adaptive Streamable TTS with Emergent abilities, the new model could pave the way for more human-like interactions with AI, reports suggest. Trained on 100,000 hours of public domain speech data, BASE TTS offers “state-of-the-art naturalness” in English as well as some German, Dutch and Spanish. Text-to-speech models are used in developing voice assistants for smart devices and apps, and for accessibility. Continue reading Amazon Claims ‘Emergent Abilities’ for Text-to-Speech Model
By Paula Parisi, January 24, 2024
Sennheiser has updated its flagship Momentum True Wireless earbuds, adding support for Qualcomm’s aptX audio tech. The company also debuted a Momentum Sport edition that tracks heart rate and body temperature. The Momentum True Wireless 4 promises “unparalleled sound,” combining Sennheiser’s audio expertise with Qualcomm’s S5 Sound Gen 2 platform and Snapdragon Sound Technology with aptX for lossless sound and ultra-low latency. Boasting 7.5 hours of continuous listening, the new buds come in black copper, metallic silver, and graphite for $300. The more rugged Momentum Sport with biometric features lists for $330. Continue reading CES: Sennheiser Touts Its New Wireless Momentum Earbuds