By Paula Parisi, October 1, 2024
The Allen Institute for AI (also known as Ai2, founded by Paul Allen and led by Ali Farhadi) has launched Molmo, a family of four open-source multimodal models. While advanced models “can perceive the world and communicate with us,” Molmo “goes beyond that to enable one to act in their worlds, unlocking a whole new generation of capabilities, everything from sophisticated web agents to robotics,” according to Ai2. On some third-party benchmark tests, Molmo’s 72-billion-parameter model outperforms other open offerings and “performs favorably” against proprietary rivals like OpenAI’s GPT-4o, Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet, Ai2 says. Continue reading Allen Institute Announces Vision-Optimized Molmo AI Models
By Paula Parisi, September 16, 2024
OpenAI is previewing a new series of AI models that can reason through problems and correct complex coding mistakes, providing a more efficient solution for developers. The new o1 models are “designed to spend more time thinking before they respond, much like a person would,” and as a result can “solve harder problems than previous models in science, coding, and math,” OpenAI claims, noting that “through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.” The first model in the series is being released in preview in OpenAI’s popular ChatGPT and in the company’s API. Continue reading OpenAI Previews New LLMs Capable of Complex Reasoning
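For developers with API access, the preview model slots into the standard chat completions call. Below is a minimal sketch using the OpenAI Python SDK; the “o1-preview” model identifier and the buggy-function prompt are illustrative assumptions, and at launch the reasoning models did not accept system messages or a temperature setting:

```python
# Minimal sketch: asking a reasoning model to find and fix a coding bug
# via the OpenAI Python SDK. Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",  # assumed preview-release identifier
    messages=[
        {
            "role": "user",
            "content": "Find and fix the bug: def mean(xs): return sum(xs) / len(xs) - 1",
        }
    ],
)
print(response.choices[0].message.content)
```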
By Paula Parisi, September 5, 2024
China’s largest cloud computing company, Alibaba Cloud, has released a new computer vision model, Qwen2-VL, which the company says improves on its predecessor in visual understanding, including video comprehension and the ability to recognize text within images in languages including English, Japanese, French, Spanish, Chinese and others. The company says it can analyze videos of more than 20 minutes in length and answer questions about their content. Third-party benchmark tests compare Qwen2-VL favorably to leading competitors, and the company is releasing two open-source versions with a larger private model to come. Continue reading Alibaba’s Latest Vision Model Has Advanced Video Capability
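The open-source checkpoints can be loaded through Hugging Face Transformers. Below is a minimal sketch for single-image Q&A, assuming the Qwen/Qwen2-VL-7B-Instruct checkpoint, a Transformers release that includes Qwen2-VL support, and a placeholder image path:

```python
# Minimal sketch: image question answering with an open-source Qwen2-VL
# checkpoint via Hugging Face Transformers (checkpoint name assumed).
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # assumed open-source checkpoint
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# The chat template inserts vision placeholder tokens for the image slot.
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "What does the sign in this photo say?"},
    ],
}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image = Image.open("photo.jpg")  # placeholder path

inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, not the prompt.
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```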
By Paula Parisi, September 4, 2024
An AI-powered coding app called Cursor is building a fanbase, with everyone from hobbyists to engineers subscribing to the service. The platform reportedly has 30,000 paying customers, among them employees at OpenAI, Midjourney and Perplexity. Referred to as “the ChatGPT of coding,” Cursor uses popular models including GPT-4o and Claude 3.5 Sonnet to automate building apps and other coding tasks. Cursor was launched by two-year-old startup Anysphere, which has raised more than $60 million in Series A funding led by Andreessen Horowitz and Thrive Capital. Continue reading New AI Coding App Cursor Gains Following and $60M in Funds
By Paula Parisi, August 27, 2024
OpenAI announced that its newest model, GPT-4o, can now be customized. The company said the ability to fine-tune the multimodal GPT-4o has been “one of the most requested features from developers.” Customization can give the model’s responses a more specific structure and tone, or have it follow instruction sets geared toward individual use cases. Developers can now train on custom datasets, aiming for better performance at a lower cost. The ChatGPT maker is rolling out the welcome mat by offering 1 million training tokens per day “for free for every organization” through September 23. Continue reading OpenAI Pushes GPT-4o Customization with Free Token Offer
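In practice, fine-tuning runs through the same API flow as OpenAI’s earlier models: upload a JSONL file of example conversations, then start a job. Below is a minimal sketch with the OpenAI Python SDK, assuming the gpt-4o-2024-08-06 snapshot and a local training_data.jsonl file:

```python
# Minimal sketch: launching a GPT-4o fine-tuning job via the OpenAI Python SDK.
# training_data.jsonl holds one chat-formatted example per line, e.g.:
# {"messages": [{"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Upload the custom dataset for fine-tuning.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job against the assumed GPT-4o snapshot.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)
print(job.id, job.status)
```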
By Paula Parisi, August 5, 2024
OpenAI has released its new Advanced Voice Mode in a limited alpha rollout for select ChatGPT Plus users. The feature, which is being implemented for the ChatGPT mobile app on Android and iOS, aims for more natural dialogue with the AI chatbot. Powered by the multimodal GPT-4o, Advanced Voice Mode is said to be able to sense vocal and emotional inflections, including excitement, sadness or singing. According to an OpenAI post on X, the company plans to “continue to add more people on a rolling basis” so that everyone using ChatGPT Plus will have access to the new feature in the fall. Continue reading OpenAI Brings Advanced Voice Mode Feature to ChatGPT Plus
By Paula Parisi, July 29, 2024
San Francisco-based OpenAI revealed it is currently testing SearchGPT, a prototype of new AI search features that provides “fast and timely answers with clear and relevant sources.” The testing arrives as similar technology is made available by leading search services Google and Microsoft Bing. The SearchGPT prototype, featuring a user interface similar to that of OpenAI’s ChatGPT chatbot and virtual assistant, launched last week to a group of 10,000 test users and publishers who will be tapped for feedback. The plan is to iterate on the prototype and then integrate the improved search features directly into ChatGPT, although no timeline was provided. Continue reading OpenAI Begins Testing Prototype of New AI Search Features
By Paula Parisi, July 15, 2024
OpenAI has partnered with the Los Alamos National Laboratory to study the ways artificial intelligence frontier models can assist with scientific research in an active lab environment. Established in 1943, the New Mexico facility is best known as home to the Manhattan Project and the development of the world’s first atomic bomb. It currently focuses on national security challenges under the direction of the Department of Energy. As part of the new partnership, the lab will work with OpenAI to produce what it describes as a first-of-its-kind study on the impact of artificial intelligence on biosecurity. Continue reading OpenAI Teams with Los Alamos for Frontier Model Research
By Paula Parisi, July 9, 2024
San Francisco-based optics company Solos has debuted its latest smart glasses, the Solos AirGo Vision, which offer a camera that takes photos and provides computer vision, and integrate OpenAI’s GPT-4o. The AirGo Vision can provide real-time information using visual input, recognizing people, objects and places, and providing information such as directions or instructions. Both the camera and AI functionality are hands-free, making the AirGo Vision “especially convenient for visual progress and next steps on activities like cooking, home improvement projects, education and studies, and even shopping,” the company explains. Continue reading Solos AirGo Vision Smart Glasses Tout a Camera and GPT-4o
By Paula Parisi, June 21, 2024
Anthropic has launched a powerful new AI model, Claude 3.5 Sonnet, that can analyze text and images and generate text. That its release comes a mere three months after Anthropic debuted Claude 3 indicates just how quickly the field is developing. The Google-backed company says Claude 3.5 Sonnet has set “new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval).” Sonnet is Anthropic’s mid-tier model, positioned between the entry-level Haiku and the high-end Opus. Anthropic says Claude 3.5 Sonnet is twice as fast as Claude 3 Opus, offering “frontier intelligence at 2x the speed.” Continue reading Anthropic’s Claude 3.5: ‘Frontier Intelligence at 2x the Speed’
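Since the model handles both text and images, a typical API request pairs an image block with a question. Below is a minimal sketch with the Anthropic Python SDK, assuming the claude-3-5-sonnet-20240620 snapshot and a placeholder chart.png file:

```python
# Minimal sketch: text-plus-image analysis with Claude 3.5 Sonnet via the
# Anthropic Python SDK. Assumes ANTHROPIC_API_KEY is set in the environment.
import base64

import anthropic

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:  # placeholder image file
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed snapshot name
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": image_data}},
            {"type": "text", "text": "Summarize what this chart shows."},
        ],
    }],
)
print(message.content[0].text)
```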
By Paula Parisi, June 18, 2024
Zeta Labs has raised $2.9 million in pre-seed funding and launched JACE, an AI assistant that can autonomously complete complex tasks. The LLM-powered JACE agent executes in-browser actions on command. In fact, Zeta claims JACE is so autonomous that it eliminates the need to be sitting in front of a computer while it executes requests — just tell it what you’d like it to do and let it go. London-based Zeta says it will use the money to expand its engineering team, host and train models, and improve JACE’s speed and reliability. Continue reading UK’s Zeta Labs Unveils JACE, a Next Generation AI Assistant
By Paula Parisi, May 30, 2024
OpenAI has begun training a new flagship artificial intelligence model to succeed GPT-4, the technology currently associated with ChatGPT. The new model — which some are already calling GPT-5, although OpenAI hasn’t yet shared its name — is expected to take the company’s capabilities to the next level as it works toward artificial general intelligence (AGI), intelligence equal to or surpassing human cognitive abilities. The company also announced it has formed a new Safety and Security Committee two weeks after dissolving the old one upon the departure of OpenAI co-founder and Chief Scientist Ilya Sutskever. Continue reading OpenAI Is Working on New Frontier Model to Succeed GPT-4
By Paula Parisi, May 28, 2024
Meta Platforms has unveiled its first natively multimodal model, Chameleon, which observers say could make it competitive with frontier model firms. Although Chameleon is not yet released, Meta says internal research indicates it outperforms the company’s own Llama 2 in text-only tasks and “matches or exceeds the performance of much larger models” including Google’s Gemini Pro and OpenAI’s GPT-4V in a mixed-modal generation evaluation “where either the prompt or outputs contain mixed sequences of both images and text.” In addition, Meta calls Chameleon’s image generation “non-trivial,” noting that it is “all in a single model.” Continue reading Meta Advances Multimodal Model Architecture with Chameleon
By Paula Parisi, May 24, 2024
Amazon is planning to add generative artificial intelligence to its decade-old voice assistant, Alexa, an upgrade that will require a monthly fee to offset the cost of the technology, according to reports. Although the new Alexa price plan has not been disclosed, it is not expected to be included in the $139 yearly Amazon Prime plan. The possible move comes as Apple is also undergoing an AI overhaul of its voice assistant, Siri. Once considered ahead of their time by many consumers, Siri and Alexa are now playing catch-up to AI assistants from leaders in the space including Google, Microsoft and OpenAI. Continue reading Amazon to Upgrade Alexa with AI, May Add Subscription Fee
OpenAI CTO Mira Murati announced during a live-streamed event today that the company is launching an updated version of the GPT-4 model that powers OpenAI’s popular chatbot. The new flagship AI model, GPT-4o, is reportedly “much faster” and offers improved text, voice and vision capabilities. Murati said GPT-4o will be free to all users, while Plus users will enjoy “up to five times the capacity limits” available to free users. According to OpenAI, the new AI model “can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.” Continue reading OpenAI Unveils Faster AI Model, Desktop Version of ChatGPT
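For API users, the new model is a drop-in model name in the standard chat completions call. Below is a minimal sketch with the OpenAI Python SDK; the client-side timing shown measures the full network round trip, not the audio response latency OpenAI quotes:

```python
# Minimal sketch: a timed GPT-4o text request via the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set; wall-clock time includes network overhead.
import time

from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "In one sentence, what is GPT-4o?"}],
)
elapsed = time.perf_counter() - start

print(f"{elapsed:.2f}s: {response.choices[0].message.content}")
```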