Audio Archives

OpenAI: sCM Generates Media 50x Faster Than Other Models

By Paula Parisi
October 28, 2024

OpenAI is taking a new approach to generating media that it says is 50 times faster than the models commonly used today. Called sCM, the approach is a “consistency model,” a variation on the diffusion method used by many leading systems. OpenAI claims its new model is ideal for training on large-scale datasets and generating video, audio and images that are of “comparable sample quality to leading diffusion models.” Diffusion models often require hundreds of sampling steps, creating challenges when it comes to real-time applications. OpenAI aims to change this with a faster system that requires less power.

Runway’s Act-One Facial Capture Could Be a ‘Game Changer’

By Paula Parisi
October 25, 2024

Runway is launching Act-One, a motion capture system that uses video and voice recordings to map human facial expressions onto characters using the company’s latest model, Gen-3 Alpha. Runway calls it “a significant step forward in using generative models for expressive live action and animated content.” Compared to past facial capture techniques, which typically require complex rigging, Act-One is driven directly and only by the performance of an actor, requiring “no extra equipment,” making it more likely to capture and preserve an authentic, nuanced performance, according to the company.

Amazon’s Entry Level Fire Stick HD Adds Alexa Voice Control

By Paula Parisi
October 22, 2024

Amazon has launched a new Fire TV Stick HD, supplanting the Fire TV Stick and Fire TV Stick Lite as its entry-level television device. Priced at $34.99, the black stick plugs into the HDMI port at the back or side of most TVs. A micro USB cable and power plug are included. The Fire TV Stick HD streams at up to 1080p HD and also supports HDR, HDR10, HDR10+ and HLG. While the new device does not feature support for Dolby Vision or Dolby Atmos, its HDMI port will support Dolby-encoded audio. The platform streamlines access to all major streaming services, which of course require independent subscriptions.

Qobuz Queues Up Hi-Resolution DSD and DXD Music Tracks

By Paula Parisi
October 15, 2024

Qobuz, an audio platform in operation since 2007 that has been growing in popularity worldwide, recently added the Direct Stream Digital (DSD) and Digital eXtreme Definition (DXD) audio formats as download options in its online store. Described as the ultimate in hi-res digital audio capture and playback, the two are not commonly offered. And while there is not yet a large catalog of songs, audiophiles welcome the news as a harbinger of things to come. Qobuz has already added about 22,500 mainly DSD tracks to its catalog of more than 100 million songs, and says there are more on tap.

Meta’s Movie Gen Model is a Powerful Content Creation Tool

By Paula Parisi
October 8, 2024

Meta Platforms has unveiled Movie Gen, a new family of AI models that generates video and audio content. Coming to Instagram next year, Movie Gen also allows a high degree of editing and effects customization using text prompts. Meta CEO Mark Zuckerberg demonstrated its abilities last week in an example shared on his Instagram account, where he transforms a leg press machine at the gym into a steampunk machine and then into one made of molten gold. The models have been trained on a combination of licensed and publicly available datasets.

YouTube Updates Shorts Player, Extends Length to 3 Minutes

By Paula Parisi
October 7, 2024

Beginning October 15, YouTube Shorts will extend its maximum length to 3 minutes. The move competitively positions the Google unit against TikTok, which allows for videos of up to 10 minutes when recording, or an hour when uploading. Regular YouTube accommodates videos of up to 12 hours for verified accounts and 15 minutes for unverified accounts, whether live or uploaded. But in terms of marketing focus, the current attention is on short-form video. YouTube is also updating the Shorts player, adding templates, and introducing a Shorts trends page for mobile.

Snapchat: My AI Goes Multimodal with Google Cloud, Gemini

By Paula Parisi
October 2, 2024

Snap Inc. is leveraging its relationship with Google Cloud to use Gemini for powering generative AI experiences within Snapchat’s My AI chatbot. The multimodal capabilities of Gemini on Vertex AI will greatly increase the My AI chatbot’s ability to understand and operate across different types of information such as text, audio, image, video and code. Snapchatters can use My AI to take advantage of Google Lens-like features, including asking the chatbot “to translate a photo of a street sign while traveling abroad, or take a video of different snack offerings to ask which one is the healthiest option.”

Google Unveils New Updates to Its AI-Powered NotebookLM

By Paula Parisi
October 1, 2024

Google has updated its AI assistant, NotebookLM, allowing the AI note-taking and research tool to provide summaries of audio files and YouTube videos. First released at the Google I/O developer conference in 2023, NotebookLM even creates sharable AI-generated audio discussions and podcasts. It allows users to upload sources including PDFs, Google Docs, Google Slides and websites. The items, including text, can be stored in shareable “notebooks,” organizing material in a central location, and users can ask Google’s Gemini AI questions about the notebook material. Initially embraced by students and educators, it has become equally popular among business users.

Alibaba Cloud Ups Its AI Game with 100 Open-Source Models

By Paula Parisi
September 25, 2024

Alibaba Cloud last week released more than 100 new open-source variants of its large language foundation model, Qwen 2.5, to the global open-source community. The company has also revamped its proprietary offering as a full-stack AI-computing infrastructure across cloud products, networking and data center architecture, all aimed at supporting the growing demands of AI computing. Alibaba Cloud’s significant contribution was revealed at the Apsara Conference, the annual flagship event held by the cloud division of China’s e-retail giant, often referred to as the Chinese Amazon.

GoPro’s Hero13 Black Adds New Lens Mount and HLG HDR

By Paula Parisi
September 19, 2024

GoPro has announced two new cameras, the $399 Hero13 Black with swappable lenses, and its smallest 4K camera ever, the $199 Hero. The high-end Hero13 Black boasts better battery performance and four interchangeable Hero Black-series lens modules with automatic adjustments for settings. A 13x Burst Slo-Mo feature captures up to 400 frames per second at 720p, with options for 5.3K at 120 fps or 900p at 360 fps. Wi-Fi 6 uploads with up to 40 percent faster transfer speeds, along with enhanced audio and voice settings, are among the upgrades.

Blackmagic Camera for Android Adds Array of New Features

By Paula Parisi
September 19, 2024

Blackmagic Design is releasing its Blackmagic Camera for Android 1.3 update, which adds support for recording timecode, anamorphic lens de-squeeze functionality and lens correction settings, as well as support for off-speed and time lapse recording. Available at Google Play free of charge, it supports Google’s latest OS, Android 14, which means it should offer some interesting creative possibilities with Gemini, the new Pixel 9 series’ native AI. Some features are backward compatible. Customers with Pixel 6, 7, 8 and 9 phones can record at frame rates of 120fps and 240fps at 720p, and 120fps at 1080p.

Amazon Is Inviting Audible Narrators to Create AI Voice Clones

By Paula Parisi
September 12, 2024

Amazon is aiming to speed up production of its Audible audiobooks by inviting a small group of narrators to clone their voices using generative artificial intelligence. The U.S. beta test will roll out later this year, according to Amazon, which announced the move on Audible’s creator marketplace. “There is a vast catalog of books that does not yet exist in audio and as we explore ways to bring more books to life on Audible, we’re committed to thoughtfully balancing the interests of authors, narrators, publishers, and listeners,” Amazon explains.

Will.i.am Launches AI-Powered Interactive Service RAiDiO.FYI

By Paula Parisi
September 3, 2024

Musician and tech entrepreneur will.i.am is launching an interactive radio service built around conversational AI. Called RAiDiO.FYI, the service lets listeners talk to artificial intelligence serving as DJs as part of a one-on-one exchange designed as a personalized listening experience. RAiDiO.FYI’s AI DJs are trained to converse about topics ranging from music to sports, weather and breaking news. The new service is an offshoot of the performer’s FYI.AI, a platform of digital tools for artists. Users can access RAiDiO.FYI for free on the FYI app for iPhone and Android.

ElevenLabs Reader App Is Available Globally in 32 Languages

By Paula Parisi
August 29, 2024

New York-based ElevenLabs is going global with its generative AI text-to-speech reader app, which can narrate writings in 32 languages with thousands of voices from which to choose. The audio startup promises “high quality, human-like” AI voices that are “emotionally and contextually aware,” adapting delivery of written cues “to achieve a high emotional range.” ElevenLabs has focused on “creative workflow,” with voice isolator and audio effects generator tools. Its catalog includes the voices of celebrities Judy Garland, Laurence Olivier, James Dean and Burt Reynolds. Custom models for translation and voiceover work using contemporary actors are a future possibility.

Bill Mandating GenAI Watermarks Gains Support in California

By Paula Parisi
August 28, 2024

Adobe, OpenAI and Microsoft are among the major firms backing a California bill that would require tech companies to label AI-generated content with watermarks embedded in the metadata. Such data is easily accessible via browser for material circulated on the Internet, and the initiative would likely involve a campaign to educate the general public on how to find it. The proposed law encompasses video and audio as well as images. The three companies currently supporting the bill initially opposed it, using terms like “unworkable” and “overly burdensome.”