By Paula Parisi, December 18, 2024
Pika Labs has updated its generative video model to Pika 2.0, adding more user control and customizability, the company says. Improvements include better “text alignment,” making it easier to have the AI follow through with intricate prompts. Enhanced motion rendering is said to deliver more “naturalistic movement” and better physics, including greater believability in transformations that tend toward the surreal, which has typically been a challenge for genAI tools. The biggest change may be “Scene Ingredients,” which lets users add their own images when building Pika-generated videos. Continue reading Pika 2.0 Video Generator Adds Character Integration, Objects
By Paula Parisi, December 17, 2024
Meta’s FAIR (Fundamental AI Research) team has unveiled recent work in areas ranging from transparency and safety to agents, and architectures for machine learning. The projects include Meta Motivo, a foundation model for controlling the behavior of virtual embodied agents, and Video Seal, an open-source model for video watermarking. All were developed in the unit’s pursuit of advanced machine intelligence, helping “models to learn new information more effectively and scale beyond current limits.” Meta announced it is sharing the new FAIR research, code, models and datasets so the research community can build upon its work. Continue reading Meta Rolls Out Watermarking, Behavioral and Concept Models
By Paula Parisi, December 13, 2024
With AI powering a range of new world-building apps, 2025 could be the year the metaverse finally makes an impact. Midjourney joins the world-building club with Patchwork, a collaborative canvas for creating “infinite” fictional worlds. Now in research preview, the tool is being developed as a standalone app, though preview access requires a Midjourney Discord account linked to a Google account. Users are able to connect characters and worlds, and “share” their developing world, which evolves as a “board,” with up to 100 collaborative partners on Midjourney (though the company recommends fewer participants for a more focused experience). Continue reading Midjourney Touts Collaborative World-Building App Patchwork
By Paula Parisi, December 10, 2024
Meta Platforms has packed more artificial intelligence into a smaller package with Llama 3.3, which the company released last week. The open-source large language model (LLM) “improves core performance at a significantly lower cost, making it even more accessible to the entire open-source community,” Meta VP of Generative AI Ahmad Al-Dahle wrote on X social. The 70 billion parameter text-only Llama 3.3 is said to perform on par with the 405 billion parameter model that was part of Meta’s Llama 3.1 release in July, with less computing power required, significantly lowering its operational costs. Continue reading Meta’s Llama 3.3 Delivers More Processing for Less Compute
By Paula Parisi, December 9, 2024
OpenAI has launched ChatGPT Pro, a $200 per month subscription plan that provides unlimited access to the full version of o1, its new large reasoning model, and all other OpenAI models. The toolkit includes o1-mini, GPT-4o and Advanced Voice. It also includes the new o1 pro mode, “a version of o1 that uses more compute to think harder and provide even better answers to the hardest problems,” OpenAI explains, describing the high-end subscription plan as a path to “research-grade intelligence” for scientists, engineers, enterprise, academics and others who use AI to accelerate productivity. Continue reading OpenAI Announces $200 Monthly Subscription for ChatGPT Pro
By Paula Parisi, December 6, 2024
Google DeepMind’s new Genie 2 is a large foundation world model that generates interactive 3D worlds that are being likened to video games. “Games play a key role in the world of artificial intelligence research,” says Google DeepMind, noting “their engaging nature, challenges and measurable progress make them ideal environments to safely test and advance AI capabilities.” Based on a simple prompt image, Genie 2 is capable of producing “an endless variety of action-controllable, playable 3D environments” — suitable for training and evaluating embodied agents — that can be played by a human or AI agent using keyboard and mouse inputs. Continue reading DeepMind Genie 2 Creates Worlds That Emulate Video Games
By Paula Parisi, December 5, 2024
After years of focusing on AI infrastructure, Amazon is plunging into the frontier model business with the Nova series. The new family of generative AI models includes the text-to-text model Amazon Nova Micro and Amazon Nova Lite for fast, mobile-friendly apps, and at the upper echelon the multimodal Amazon Nova Pro and Amazon Nova Premier for processing text, images and video. Amazon, which is heavy into production via Amazon Studios and MGM, has also launched two specialty models focused on “studio quality” output: Amazon Nova Canvas for images and Amazon Nova Reel for video. Continue reading Amazon Dives into Generative AI with Nova Foundation Models
By Paula Parisi, December 4, 2024
Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude that has prompted one observer to ponder the possibility that keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices. Continue reading Hume AI Introduces Voice Control and Claude Interoperability
By Paula Parisi, December 3, 2024
German media company Bertelsmann has partnered with AI startup ElevenLabs on an effort to drive tech innovation and workflow across Bertelsmann production, marketing and distribution. Bertelsmann operations span roughly 50 countries with businesses including the publisher Penguin Random House, record label BMG and the RTL Group television unit. The objective is for ElevenLabs tools in voice and audio generation to help Bertelsmann expand productivity and reach. In August, New York-based ElevenLabs opened a European headquarters in London, expanding its international footprint for text-to-speech and other audio apps. Continue reading Bertelsmann and ElevenLabs Team Up to Foster AI Production
By Paula Parisi, December 2, 2024
Anticipating what one outlet calls “the likely imminent release of OpenAI’s Sora,” generative AI video competitors are compelled to step up their game. Luma AI has released a major upgrade to its Dream Machine, speeding its already quick video generation and enabling a chat function for natural language prompts, so you can talk to it as with OpenAI’s ChatGPT. In addition to the new interface, Dream Machine is going mobile and adding a new foundation image model, Luma AI Photon, which “has been purpose built to advance the power and capabilities of Dream Machine,” according to the company. Continue reading Luma AI Upgrades Its Video Generator and Adds Image Model
By Paula Parisi, November 27, 2024
Nvidia has unveiled an AI sound model research project called Fugatto that “can create any combination of music, voices and sounds” based on text and audio inputs. Nvidia describes it as “the world’s most flexible sound machine,” and many appear to agree that the new model represents an audio breakthrough, with the potential to generate a wide array of sounds that have not previously existed. While popular sound models from companies including Suno and ElevenLabs “can compose a song or modify a voice, none have the dexterity of the new offering,” Nvidia claims. Continue reading Nvidia AI Model Fugatto a Breakthrough in Generative Sound
By Paula Parisi, November 22, 2024
Nvidia sales were up 94 percent to $35 billion in the most recent quarter, when profits more than doubled to $19.3 billion, telegraphing the strength of the artificial intelligence boom that took the company from the top supplier of graphics boards for gaming PCs to the world’s most valuable public company with a market cap of $3.59 trillion. Nvidia founder and CEO Jensen Huang told analysts that demand for the company’s latest AI chip, Blackwell, has been “incredible,” driving projections of $37.5 billion in revenue for the current quarter as customers begin to take shipments. Continue reading AI Boom Boosts Nvidia Sales by 94 Percent as Profits Double
By Paula Parisi, November 21, 2024
Promise is a new entertainment studio launched around the potential of generative AI. The Los Angeles-based startup is developing a multiyear slate of films, TV shows and media in “new formats.” With funding led by Peter Chernin’s North Road Company and Andreessen Horowitz, Promise vows to set “a new standard for high-quality storytelling enabled by AI.” The firm is also working on new tools to optimize the generative workflow. The first product, MUSE, “integrates the latest GenAI technology throughout the creative process in a streamlined, collaborative, and secure production environment.” Continue reading Promise Is an Entertainment Studio Built Around Generative AI
By Paula Parisi, November 19, 2024
AI is apparently whetting appetites for more than creative exploration. Yum Brands, which owns Taco Bell, KFC and Pizza Hut, says its new AI-driven marketing campaigns are driving more customers into stores, increasing purchases and reducing churn. Trials with “personalized marketing campaigns” that leverage artificial intelligence are leading to strong results, according to the company. Meanwhile, Coca-Cola has revamped its circa 1995 “Holidays Are Coming” TV ad with the help of artificial intelligence and production studio Secret Level, though the critical and customer reaction to that has reportedly been mixed. Continue reading Small to Super-Sized Businesses Are Getting a Boost from AI
By Paula Parisi, November 19, 2024
A digital avatar may soon join the talent lineup on ESPN’s college football show “SEC Nation.” Called FACTS, the AI-generated character was developed at the ESPN Edge Innovation Center as “a way to help foster engagement and educate fans on complex sports analytics,” according to ESPN. The avatar was unveiled last week at the 4th Annual ESPN Edge Conference. Built on Nvidia’s Omniverse platform, using the company’s ACE microservices, FACTS integrates with Azure OpenAI for natural language processing and ElevenLabs for text-to-speech integration. Continue reading ESPN Readies a Data-Filled Sports Talk Host Generated by AI