Google Announces the Launch of Gemini, Its Largest AI Model

Google is closing the year by heralding 2024 as the “Gemini era,” with the introduction of its “most capable and general AI model yet,” Gemini 1.0. This new foundation model is optimized for three different use-case sizes: Ultra, Pro and Nano. Alongside the launch, Google is releasing a new, Gemini-powered version of its Bard chatbot, available to English speakers in the U.S. and 170 global regions. Google touts Gemini as built from the ground up for multimodality, reasoning across text, images, video, audio and code. However, Bard will not yet incorporate Gemini’s ability to analyze sound and images.

“These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind,” Alphabet and Google CEO Sundar Pichai wrote in a blog post. “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.”

Neuroscience researcher Demis Hassabis, co-founder of DeepMind and CEO of Google DeepMind, in the same post calls Gemini “the most capable and general model we’ve ever built.”

The New York Times says Google has “released benchmark test results claiming that Gemini’s most powerful version outperformed OpenAI’s latest technology, GPT-4, in several key areas.” Highlights of those tests are available on Google’s Gemini capabilities page, and detailed in a technical report.

NYT writes that Gemini “is better at generating computer code than previous Google technologies” and “can more accurately summarize news articles and other text documents.”

The three versions of Gemini, optimized for different capabilities, will roll out in stages, with the lightweight Nano debuting this week on Pixel 8 Pro smartphones.

Nano “will now be the brains behind an AI summarizer feature in the Recorder app on the Pixel 8 Pro,” writes TechCrunch, noting that Recorder, “which lets users push a button to record and transcribe audio, will now include a Gemini-powered summary of your recorded conversations, interviews, presentations or other audio.”

Gemini transcription will be available even without a cell or Wi-Fi connection, and the feature works in 28 languages.

The mid-tier Gemini Pro will be available starting next week in support of basic Google services. Ultra, designed for complex tasks, will be commercially deployed starting next year, though NYT says third parties are testing it now.

In other Google news, Engadget reports that the company is announcing “creation of its most powerful TPU (formally known as Tensor Processing Units) yet, Cloud TPU v5p, and an AI Hypercomputer from Google Cloud.”

The Cloud TPU v5p AI accelerator is built to train and serve large models with long training periods, Engadget says, adding that “each TPU v5p pod brings 8,960 chips when using Google’s highest-bandwidth inter-chip interconnect.”

Related:
Google Admits That a Gemini AI Demo Video Was Staged, Engadget, 12/8/23
Gemini: Google’s Newest and Most Capable AI Model (Video), Google, 12/7/23
Hands-On with Gemini: Interacting with Multimodal AI (Video), Google, 12/7/23
