Google is showing off a chatbot in development that it says represents the future of AI assistants. Called Project Astra, it can “see” and “hear,” remembering the information it ingests and answering questions about it — from simple queries such as “Where did I leave my glasses?” to unpacking and explaining computer code. Demonstrated at the Google I/O conference this week, Astra understands the world “just like people do” and can converse naturally, in real time. The company says some Project Astra features may come to Gemini later this year.
“Astra understands context and is able to take action,” Google explains in a blog post, calling it “proactive, teachable and personal” and able to converse with users “naturally and without lag or delay.”
A prerecorded video played at I/O shows a person walking through a London office, first with a smartphone camera on and later wearing smart glasses, so Astra can take in the surroundings and respond to questions about the environment.
“[Apple’s] Siri and [Amazon’s] Alexa never managed to be useful assistants, but Google and others are convinced the next generation of bots is really going to work,” writes The Verge, quoting Google DeepMind CEO Demis Hassabis likening the multimodal Astra to ubiquitous helpers like the Star Trek Communicator and the digital assistant from the film “Her.”
With regard to the smart glasses, Tom’s Guide says “to be clear, Astra is coming to phones first and will be called Gemini Live, but it has the potential to move to other form factors over time,” calling the demos “impressive.”
The Alphabet company also announced the debut of Gemini 1.5 Pro for Gemini Advanced users. The model has what Google calls a “breakthrough” long context window of 1 million tokens. It is also “available with a 2 million token context window via waitlist to developers using the API and to Google Cloud customers.”
Based on user requests for a cheaper, “lighter-weight” model, Google is also introducing Gemini 1.5 Flash. The new Gemini 1.5 models represent “progress towards our ultimate goal of infinite context,” Alphabet and Google CEO Sundar Pichai said at the I/O event, according to CNET.
Hassabis said 1.5 Flash “features the multimodal reasoning capabilities and long context of Pro but is designed for speed and efficiency — ‘optimized for tasks where low latency and cost matter most’” and is able to serve at scale, notes CNET.
Both 1.5 Pro and 1.5 Flash are now in public preview with the million-token context window, available at Google AI Studio and Vertex AI.
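For developers curious to try the public preview, the sketch below shows roughly how a request to one of the new models might look through Google AI Studio, assuming the google-generativeai Python package and an API key stored in an environment variable; the prompt, variable name, and model choice are illustrative rather than part of Google’s announcement.

import os
import google.generativeai as genai

# Minimal sketch (assumes the google-generativeai package is installed and an
# API key from Google AI Studio is stored in the GOOGLE_API_KEY environment variable).
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# "gemini-1.5-flash" targets low-latency, high-volume tasks; "gemini-1.5-pro"
# exposes the longer context window described above. Model names may change.
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Summarize this article in two sentences: ...")
print(response.text)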
Related:
Everything Google Announced at I/O 2024, Wired, 5/14/24
Google Takes the Next Step in Its AI Evolution, The New York Times, 5/14/24
Google Experiments with Using Video to Search, Thanks to Gemini AI, TechCrunch, 5/14/24
Google’s Generative AI Can Now Analyze Hours of Video, TechCrunch, 5/14/24
Google Expands Digital Watermarks to AI-Made Video and Text, Engadget, 5/14/24
Introducing VideoFX, Plus New Features for ImageFX and MusicFX, Google Blog, 5/14/24
Project IDX, Google’s Next-Gen IDE, Is Now in Open Beta, TechCrunch, 5/14/24
Google’s New Private Space Feature Is Like Incognito Mode for Android, TechCrunch, 5/15/24
Google Will Use Gemini to Detect Scams During Calls, TechCrunch, 5/14/24
Google’s Call-Scanning AI Could Dial Up Censorship by Default, TechCrunch, 5/15/24