Alibaba’s Latest Vision Model Has Advanced Video Capability

China’s largest cloud computing company, Alibaba Cloud, has released a new computer vision model, Qwen2-VL, which the company says improves on its predecessor in visual understanding, including video comprehension and the reading of text within images in English, Japanese, French, Spanish, Chinese and other languages. The company says the model can analyze videos more than 20 minutes long and answer questions about their content. Third-party benchmark tests compare Qwen2-VL favorably to leading competitors, and the company is releasing two open-source versions, with a larger private model to come.

Google Teases Astra AI Assistant and Debuts Gemini 1.5 Pro

Google is showing off a developmental chatbot it says represents the future of AI assistants. Called Project Astra, it can “see” and “hear,” remembers what it takes in, and can then answer questions about that information, from simple queries such as “Where did I leave my glasses?” to unpacking and explaining computer code. Demonstrated at the Google I/O conference this week, Astra is designed, Google says, to understand the world “just like people do” and to converse naturally, in real time. The company says some Project Astra features may come to Gemini later this year.