DeepMind Genie 2 Creates Worlds That Emulate Video Games

Google DeepMind’s new Genie 2 is a large foundation world model that generates interactive 3D worlds that are being likened to video games. “Games play a key role in the world of artificial intelligence research,” says Google DeepMind, noting “their engaging nature, challenges and measurable progress make them ideal environments to safely test and advance AI capabilities.” Based on a simple prompt image, Genie 2 is capable of producing “an endless variety of action-controllable, playable 3D environments” (suitable for training and evaluating embodied agents) that can be played by a human or AI agent using keyboard and mouse inputs.

Amazon Dives into Generative AI with Nova Foundation Models

After years of focusing on AI infrastructure, Amazon is plunging into the frontier model business with the Nova series. The new family of generative AI models includes the text-to-text model Amazon Nova Micro and Amazon Nova Lite for fast, mobile-friendly apps, and at the upper echelon the multimodal Amazon Nova Pro and Amazon Nova Premier for processing text, images and video. Amazon, which is heavily involved in production via Amazon Studios and MGM, has also launched two specialty models focused on “studio quality” output: Amazon Nova Canvas for images and Amazon Nova Reel for video.

Hume AI Introduces Voice Control and Claude Interoperability

Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude, prompting one observer to ponder whether keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices.

Bertelsmann and ElevenLabs Team Up to Foster AI Production

German media company Bertelsmann has partnered with AI startup ElevenLabs on an effort to drive tech innovation and workflow across Bertelsmann production, marketing and distribution. Bertelsmann operations span roughly 50 countries with businesses including the publisher Penguin Random House, record label BMG and the RTL Group television unit. The objective is for ElevenLabs tools in voice and audio generation to help Bertelsmann expand productivity and reach. In August, New York-based ElevenLabs opened a European headquarters in London, expanding its international footprint for text-to-speech and other audio apps.

Luma AI Upgrades Its Video Generator and Adds Image Model

Anticipating what one outlet calls “the likely imminent release of OpenAI’s Sora,” generative AI video competitors are compelled to step up their game. Luma AI has released a major upgrade to its Dream Machine, speeding its already quick video generation and enabling a chat function for natural language prompts, so you can talk to it as with OpenAI’s ChatGPT. In addition to the new interface, Dream Machine is going mobile and adding a new foundation image model, Luma AI Photon, which “has been purpose built to advance the power and capabilities of Dream Machine,” according to the company.

Nvidia AI Model Fugatto a Breakthrough in Generative Sound

Nvidia has unveiled an AI sound model research project called Fugatto that “can create any combination of music, voices and sounds” based on text and audio inputs. Nvidia describes it as “the world’s most flexible sound machine,” and many appear to agree that the new model represents an audio breakthrough, with the potential to generate a wide array of sounds that have not previously existed. While popular sound models from companies including Suno and ElevenLabs “can compose a song or modify a voice, none have the dexterity of the new offering,” Nvidia claims.

AI Boom Boosts Nvidia Sales by 94 Percent as Profits Double

Nvidia sales were up 94 percent to $35 billion in the most recent quarter, when profits more than doubled to $19.3 billion, telegraphing the strength of the artificial intelligence boom that took the company from the top supplier of graphics boards for gaming PCs to the world’s most valuable public company, with a market cap of $3.59 trillion. Nvidia founder and CEO Jensen Huang told analysts that demand for the company’s latest AI chip, Blackwell, has been “incredible,” driving projections of $37.5 billion in revenue for the current quarter as customers begin to take shipments.

Promise Is an Entertainment Studio Built Around Generative AI

Promise is a new entertainment studio launched around the potential of generative AI. The Los Angeles-based startup is developing a multiyear slate of films, TV shows and media in “new formats.” With funding led by Peter Chernin’s North Road Company and Andreessen Horowitz, Promise vows to set “a new standard for high-quality storytelling enabled by AI.” The firm is also working on new tools to optimize the generative workflow. The first product, MUSE, “integrates the latest GenAI technology throughout the creative process in a streamlined, collaborative, and secure production environment.”

Small to Super-Sized Businesses Are Getting a Boost from AI

AI is apparently whetting appetites for more than creative exploration. Yum Brands, which owns Taco Bell, KFC and Pizza Hut, says its new AI-driven marketing campaigns are driving more customers into stores, increasing purchases and reducing churn. Trials with “personalized marketing campaigns” that leverage artificial intelligence are producing strong results, according to the company. Meanwhile, Coca-Cola has revamped its circa 1995 “Holidays Are Coming” TV ad with the help of artificial intelligence and production studio Secret Level, though the critical and customer reaction has reportedly been mixed.

ESPN Readies a Data-Filled Sports Talk Host Generated by AI

A digital avatar may soon join the talent lineup on ESPN’s college football show “SEC Nation.” Called FACTS, the AI-generated character was developed at the ESPN Edge Innovation Center as “a way to help foster engagement and educate fans on complex sports analytics,” according to ESPN. The avatar was unveiled last week at the 4th Annual ESPN Edge Conference. Built on Nvidia’s Omniverse platform, using the company’s ACE microservices, FACTS integrates with Azure OpenAI for natural language processing and ElevenLabs for text-to-speech integration.

YouTube Dream Track Toolset Introduces an AI Remix Feature

YouTube has added a new feature to its Dream Track toolset, which lets select U.S. creators use AI to generate songs using the vocals of artists including John Legend, Demi Lovato, Charli XCX, Charlie Puth and others. Now users can remix Dream Track songs using natural language to describe the changes they would like, stylistic and otherwise. Selecting the “restyle a track” option will steer users to creating a 30-second generative snippet for use in YouTube Shorts. The remixed snippets will credit the original song with “clear attribution” through the Short itself and the Shorts audio pivot page. It will also clearly indicate that the track was restyled with AI, according to Google.

Particle Launches AI News App That Summarizes in Quick Hits

Particle, the AI-powered news aggregator created by a pair of Twitter alums, has launched after a year in beta. The iOS app summarizes current events in quick hits the startup says do not violate the copyrights of publishers whose news it shares. Instead of simply scraping publishers’ work for proprietary use, the startup seeks to compensate publishers and drive traffic to news sites with prominent links to sources accompanying each AI news summary. Developed by Sara Beykpour and Marcel Molina, Particle has raised more than $11 million in early funding led by Lightspeed.

Baidu’s Ernie AI Gets Improved Text-to-Image and App Builder

Ernie, the foundation model for Baidu’s generative AI, has been updated with iRAG technology to mitigate visual hallucinations and a no-code tool called Miaoda that creates apps using natural language. The company behind China’s largest search engine says Ernie now handles 1.5 billion daily user queries, up from 50 million circa its March 2023 launch (a 30x increase). Baidu also debuted Ernie-powered smart glasses from its Xiaodu Technology hardware unit. The Xiaodu AI Glasses feature built-in voice activation and cameras for taking photos and video. The news was shared at this week’s Baidu World 2024 in Shanghai.

Copilot Now Enables Custom AI Themes in Microsoft Outlook

Microsoft Copilot now helps subscription users create personal themes in Outlook using generative AI. In what Microsoft says is “the first instance of dynamic AI-generated theming in productivity applications,” Copilot can now display inboxes against dynamic backdrops based on geography, the weather, or anything else users can imagine. The new feature is available across all popular platforms: Windows, Mac, iOS, Android and the Web. Just like you might “spruce up your office with artwork or plants,” Copilot lets AI enhance your digital environment, according to Microsoft.

BodyTalk Dubs into 29 Languages with Facial Moves to Match

Panjaya is an AI startup that aims to disrupt the world of video dubbing with a way to generate “hyperrealistic” recreations of a person’s voice speaking a new language. The system also automatically modifies the imagery so that lip and other physical movements match the new speech patterns. Called BodyTalk, the technique is the launch point for Panjaya as it emerges from stealth, where it conducted its R&D over the past three years, backed by $9.5 million from venture funds and angel backers. The startup describes BodyTalk as “AI dubbing that looks and feels as natural as the original.”