Google Releases Gemini 2.0 in Shift Toward Agentic Era of AI

Google has introduced Gemini 2.0, the latest version of its multimodal AI model, signaling a shift toward what the company is calling “the agentic era.” The upgraded model promises not only to outperform previous iterations on standard benchmarks but also to introduce more proactive, or agentic, functions. The company announced that “Project Astra,” its experimental assistant, would receive updates that allow it to use Google Search, Lens, and Maps, and that “Project Mariner,” a Chrome extension, would enable Gemini 2.0 to navigate a user’s web browser to complete tasks autonomously.

Midjourney Touts Collaborative World-Building App Patchwork

With AI powering a range of new world-building apps, 2025 could be the year the metaverse finally makes an impact. Midjourney joins the world-building club with Patchwork, a collaborative canvas for creating “infinite” fictional worlds. Now in research preview, the tool is being developed as a standalone app, though preview access requires a Midjourney Discord account linked to a Google account. Users are able to connect characters and worlds, and “share” their developing world — evolving as a “board” — with up to 100 collaborative partners on Midjourney (though the company recommends fewer participants for a more focused experience).

DeepMind Genie 2 Creates Worlds That Emulate Video Games

Google DeepMind’s new Genie 2 is a large foundation world model that generates interactive 3D worlds that are being likened to video games. “Games play a key role in the world of artificial intelligence research,” says Google DeepMind, noting “their engaging nature, challenges and measurable progress make them ideal environments to safely test and advance AI capabilities.” Based on a simple prompt image, Genie 2 is capable of producing “an endless variety of action-controllable, playable 3D environments” — suitable for training and evaluating embodied agents — that can be played by a human or AI agent using keyboard and mouse inputs.

Google DeepMind Touts AI-Powered Quantum Error Detection

Google DeepMind has come up with an error correction technique it says will make quantum computers more reliable, particularly at scale. While quantum computing holds tremendous promise — potentially able to solve in just a few hours problems it would take a conventional computer “billions of years” to figure out, Google claims — the systems are notoriously unstable, due to the delicacy of the “quantum state.” AlphaQubit is an AI-based decoder that identifies quantum computing errors with accuracy. Combining DeepMind’s machine learning expertise with Google Quantum AI error correction, the technique advances efforts to create a reliable quantum computer.

YouTube Dream Track Toolset Introduces an AI Remix Feature

YouTube has added a new feature to its Dream Track toolset, which lets select U.S. creators use AI to generate songs using the vocals of artists including John Legend, Demi Lovato, Charli XCX, Charlie Puth and others. Now users can remix Dream Track songs using natural language to describe the changes they would like, stylistic and otherwise. Selecting the “restyle a track” option will steer users to creating a 30-second generative snippet for use in YouTube Shorts. The remixed snippets will credit the original song with “clear attribution” through the Short itself and the Shorts audio pivot page. It will also clearly indicate that the track was restyled with AI, according to Google.

Allen Institute Announces Vision-Optimized Molmo AI Models

The Allen Institute for AI (also known as Ai2, founded by Paul Allen and led by Ali Farhadi) has launched Molmo, a family of four open-source multimodal models. While advanced models “can perceive the world and communicate with us, Molmo goes beyond that to enable one to act in their worlds, unlocking a whole new generation of capabilities, everything from sophisticated web agents to robotics,” according to Ai2. On some third-party benchmark tests, Molmo’s 72 billion parameter model outperforms other open AI offerings and “performs favorably” against proprietary rivals like OpenAI’s GPT-4o, Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet, Ai2 says.

YouTube Adding Tools to Protect Against Unauthorized AI Use

YouTube is introducing AI detection tools designed to allow people to learn when their face and/or voice are copied and used in third-party videos. As part of the effort, YouTube’s existing Content ID program that protects copyrighted music will expand to include more broad-based voice simulation detection technology. The new tools aim to protect “people from a variety of industries — from creators and actors to musicians and athletes,” according to the company. The Google-owned platform is also coming up with a way to address unauthorized use of its content for training AI models.

Google DeepMind Releases Imagen 3 for Free to U.S. Users

Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in May, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range of styles, from photorealistic, painterly and textured to whimsically cartoony.

Green Startup CuspAI Secures $30 Million, Meta Partnership

CuspAI, a UK startup that aims to help engineer sustainable materials, has raised $30 million from venture funds in Europe and the U.S., secured a partnership with Meta’s FAIR unit, and employed the advisory services of AI pioneer Geoffrey Hinton in its mission to tackle climate change. “CuspAI leverages cutting-edge generative AI, deep learning, and molecular simulation to streamline the material design process,” the company announced. Its platform “functions like a search engine for materials, allowing users to request specific properties for new materials on demand,” speeding the process of “the discovery of materials with precise functionalities.”

DeepMind’s V2A Generates Music, Sound Effects, Dialogue

Google DeepMind has unveiled new research on AI tech it calls V2A (“video-to-audio”) that can generate soundtracks for videos. The initiative complements the wave of AI video generators from companies ranging from biggies like OpenAI and Alibaba to startups such as Luma and Runway, all of which require a separate app to add sound. V2A technology “makes synchronized audiovisual generation possible” by combining video pixels with natural language text prompts “to generate rich soundscapes for the on-screen action,” DeepMind writes, explaining that it can “create shots with a dramatic score, realistic sound effects or dialogue.”

Veo AI Video Generator and Imagen 3 Unveiled at Google I/O

Google is launching two new AI models: the video generator Veo and Imagen 3, billed as the company’s “highest quality text-to-image model yet.” The products were introduced at Google I/O this week, where new demo recordings created using the Music AI Sandbox were also showcased. The 1080p Veo videos can be generated in “a wide range of cinematic and visual styles” and run “over a minute” in length, Google says. Veo is available in private preview in VideoFX by joining a waitlist. At a future date, the company plans to bring some Veo capabilities to YouTube Shorts and other products.

Google Merges Android and Hardware Units for AI Efficiency

Google is implementing an internal reorganization that combines its Android and hardware teams. Google CEO Sundar Pichai announced a new Platforms & Devices team headed by Rick Osterloh, which includes Android, Chrome, ChromeOS, Photos and all Pixel products. Pichai says the move will help speed development. Osterloh’s mandate is full-stack platform development that smoothly incorporates AI across all Google platforms, including smartphones, TVs and anything with Android OS. Hiroshi Lockheimer, who previously ran ops for Android, Chrome and ChromeOS, moves on to other projects at Google and Alphabet.

AI Crypto Firms Will Merge Tokens in Bid to Take on Big Tech

AI-centric crypto firms SingularityNET, Fetch.ai and Ocean Protocol will merge their tokens under a single token known as the Artificial Superintelligence token, ASI, with a fully diluted value of $7.6 billion. Although the three companies will continue to operate separately, an Artificial Superintelligence Alliance composed of members from each will guide the vision. The alliance’s goal is to develop decentralized AI technology on blockchain as an alternative to the large corporations currently controlling AI. The end game is to create artificial general intelligence. The merger is contingent on approval from each community’s members.

Conversational Chatbot Optimizes Google Ads, Search Results

Google’s multimodal Gemini large language model will offer chat capabilities that help advertisers build and scale Search campaigns within the Google Ads platform using natural language prompts. “We’ve been actively testing Gemini to further enhance our ads solutions, and, we’re pleased to share that Gemini is now powering the conversational experience,” Google said, explaining that the functionality is now available in beta to English language advertisers in the U.S. and UK and will roll out globally to all English language advertisers over the next few weeks, with additional languages offered in the months ahead.

Suno Plugin Gives Microsoft Copilot a Music Creation Feature

Microsoft has added generative music capabilities to its Copilot chatbot by integrating a plugin from Cambridge, Massachusetts-based startup Suno AI. Microsoft calls Suno “a leader in AI music technology, pioneering the ability to generate complete songs — lyrics, instrumentals, and singing voices — from a single sentence.” Suno offers a generative tool on Discord. The Copilot plugin is specific to Microsoft, and its biggest difference from the app offered directly by Suno is that it generates only one song per prompt, where Suno’s app provides two. The songs are generally a minute or two in length and come with lyric sheets.