Highly Realistic Alibaba GenVid Models Are Available for Free

Alibaba has open-sourced its Wan 2.1 video- and image-generating AI models, heating up an already competitive space. The Wan 2.1 family, which has four models, is said to produce “highly realistic” images and videos from text and images. The company has also been previewing a new reasoning model, QwQ-Max, since December, indicating it will be open-sourced when fully released. The move comes after another Chinese AI company, DeepSeek, released its R1 reasoning model for free download and use, triggering demand for more open-source artificial intelligence.

Muse Could Be a Gamechanger for Xbox Players, Developers

Microsoft has unveiled a new AI model called Muse that can generate game visuals and controller actions and that understands 3D space. The new model can create complex gameplay sequences with accurate physics and character behaviors. Classified by Microsoft as the first World and Human Action Model (WHAM), Muse was trained on more than seven years’ worth of human gameplay data from the Xbox game “Bleeding Edge,” developed by UK-based Microsoft subsidiary Ninja Theory. Beyond gaming goals, Microsoft says Muse can provide research insights to support all sorts of creative uses of generative AI.

Meta Adds Indigenous Languages to Speech and Translation AI

Meta is seeking to make AI more inclusive with a program to support underserved languages “and help bring their speakers into the digital conversation.” Meta’s Fundamental AI Research (FAIR) unit has teamed with UNESCO to launch the Language Technology Partner Program, which is looking for people who can provide more than 10 hours of speech recordings (with transcriptions) and chunks of written text (200+ sentences, with translation) in diverse languages. “Partners will work with our teams to help integrate these languages into AI-driven speech recognition and machine translation models, which when released will be open sourced,” Meta said.

Hugging Face Has Developed Tiny Yet Powerful Vision Models

Most people know Hugging Face as a resource-sharing community, but it also builds open-source applications and tools for machine learning. Its recent release of vision-language models small enough to run on smartphones, while outperforming competitors that rely on massive data centers, is being hailed as “a remarkable breakthrough in AI.” The new models, SmolVLM-256M and SmolVLM-500M, are optimized for “constrained devices” with roughly 1GB of RAM or less, making them well suited to smartphones and laptops and convenient for anyone who wants to process large amounts of data cheaply and with a low energy footprint.
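For those who want to experiment, a minimal sketch of running the smaller model with Hugging Face’s transformers library follows; the model ID and checkpoint details are assumptions based on how SmolVLM weights are commonly published, not specifics confirmed in the announcement.

```python
# Minimal sketch: image Q&A with a SmolVLM checkpoint via transformers.
# The model ID below is an assumption about where the weights live.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed repo name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

image = Image.open("photo.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image in one sentence."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=80)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```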

CES: Nvidia’s Cosmos Models Teach AI About Physical World

Nvidia Cosmos, a platform of generative world foundation models (WFMs) and related tools to advance the development of physical AI systems like autonomous vehicles and robots, was introduced at CES 2025. Cosmos WFMs are designed to give developers a way to generate massive amounts of photo-real, physics-based synthetic data to train and evaluate their existing models. The goal is to reduce costs by streamlining real-world testing with a ready data pipeline. Developers can also build custom models by fine-tuning Cosmos WFMs. Cosmos integrates with Nvidia Omniverse, a physics simulation tool used for entertainment world-building.

Meta’s Llama 3.3 Delivers More Processing for Less Compute

Meta Platforms has packed more artificial intelligence into a smaller package with Llama 3.3, which the company released last week. The open-source large language model (LLM) “improves core performance at a significantly lower cost, making it even more accessible to the entire open-source community,” Meta VP of Generative AI Ahmad Al-Dahle wrote on X. The 70-billion-parameter text-only Llama 3.3 is said to perform on par with the 405-billion-parameter model from Meta’s Llama 3.1 release in July while requiring less computing power, significantly lowering operational costs.
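As a quick illustration, the sketch below queries the model through the transformers text-generation pipeline; the model ID follows Meta’s usual Hugging Face naming and is an assumption, and access requires accepting Meta’s license on the Hub.

```python
# Minimal sketch: chatting with Llama 3.3 via the transformers pipeline.
# The model ID is assumed; the repo is gated behind Meta's license.
import torch
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed repo name
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 70B weights across available GPUs
)
messages = [{"role": "user", "content": "In one sentence, what is an LLM?"}]
result = chat(messages, max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])  # assistant's reply
```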

Lightricks LTX Video Model Impresses with Speed and Motion

Lightricks has released an AI model called LTX Video (LTXV) that it says generates five seconds of 768 x 512 video (121 frames) in just four seconds, outputting clips in less time than it takes to watch them. The model can run on consumer-grade hardware and is open source, positioning Lightricks as a mass-market challenger to firms like Adobe, OpenAI and Google and their proprietary systems. “It’s time for an open-sourced video model that the global academic and developer community can build on and help shape the future of AI video,” Lightricks co-founder and CEO Zeev Farbman said.
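Since the weights are open, the model can be driven from Hugging Face’s diffusers library; the sketch below assumes the commonly used pipeline class and repo name rather than anything stated in the announcement.

```python
# Minimal sketch: text-to-video with LTX Video through diffusers.
# Pipeline class and repo ID are assumptions about the open release.
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",  # assumed repo name
    torch_dtype=torch.bfloat16,
).to("cuda")

frames = pipe(
    prompt="A sailboat gliding across a calm bay at sunset",
    width=768,
    height=512,
    num_frames=121,  # ~5 seconds at 24 fps, matching the claim above
).frames[0]
export_to_video(frames, "ltx_clip.mp4", fps=24)
```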

Nvidia’s Impressive AI Model Could Compete with Top Brands

Nvidia has debuted a new AI model, Llama-3.1-Nemotron-70B-Instruct, that it claims outperforms competitors GPT-4o from OpenAI and Anthropic’s Claude 3.5 Sonnet. The impressive showing has prompted speculation about an AI shakeup and a significant shift in Nvidia’s AI strategy, which has thus far focused primarily on chipmaking. The model was quietly released on Hugging Face, and Nvidia says that as of October 1 it ranked first on three top automatic alignment benchmarks, “edging out strong frontier models” and vaulting Nvidia to the forefront of the LLM field in areas like comprehension, context and generation.

‘EU AI Act Checker’ Holds Big AI Accountable for Compliance

A new LLM framework evaluates how well generative AI models comply with the legal requirements of the European Union’s AI Act. The free and open-source software is the product of a collaboration between ETH Zurich; Bulgaria’s Institute for Computer Science, Artificial Intelligence and Technology (INSAIT); and Swiss startup LatticeFlow AI. It is being billed as “the first evaluation framework of the EU AI Act for Generative AI models.” Already, it has found that some of the top AI foundation models are falling short of European regulatory goals in areas including cybersecurity resilience and discriminatory output.

Nvidia Releases Open-Source Frontier-Class Multimodal LLMs

Nvidia has unveiled the NVLM 1.0 family of multimodal LLMs, a powerful open-source AI that the company says performs comparably to proprietary systems from OpenAI and Google. Led by NVLM-D-72B, with 72 billion parameters, Nvidia’s new entry in the AI race achieved what the company describes as “state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models.” Nvidia has made the model weights publicly available and says it will also be releasing the training code, a break from the closed approach of OpenAI, Anthropic and Google.

Allen Institute Announces Vision-Optimized Molmo AI Models

The Allen Institute for AI (also known as Ai2, founded by Paul Allen and led by Ali Farhadi) has launched Molmo, a family of four open-source multimodal models. While advanced models “can perceive the world and communicate with us, Molmo goes beyond that to enable one to act in their worlds, unlocking a whole new generation of capabilities, everything from sophisticated web agents to robotics,” according to Ai2. On some third-party benchmark tests, Molmo’s 72-billion-parameter model outperforms other open AI offerings and “performs favorably” against proprietary rivals like OpenAI’s GPT-4o, Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet, Ai2 says.

Meta Unveils New Open-Source Multimodal Model Llama 3.2

Meta’s Llama 3.2 release includes two new multimodal LLMs, one with 11 billion parameters and one with 90 billion — considered small- and medium-sized — and two lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices. Included are pre-trained and instruction-tuned versions. In addition to text, the multimodal models can interpret images, supporting apps that require visual understanding. Meta says the models are free and open source. Alongside them, the company is releasing “the first official Llama Stack distributions,” enabling “turnkey deployment” with integrated safety.

Alibaba Cloud Ups Its AI Game with 100 Open-Source Models

Alibaba Cloud last week released more than 100 new open-source variants of its large language foundation model, Qwen 2.5, to the global open-source community. The company has also revamped its proprietary offering as a full-stack AI-computing infrastructure across cloud products, networking and data center architecture, all aimed at supporting the growing demands of AI computing. The releases were revealed at the Apsara Conference, the annual flagship event held by the cloud division of China’s e-retail giant, which is often referred to as the Chinese Amazon.

Alibaba’s Latest Vision Model Has Advanced Video Capability

China’s largest cloud computing company, Alibaba Cloud, has released a new computer vision model, Qwen2-VL, which the company says improves on its predecessor in visual understanding, including video comprehension and recognition of text within images in languages including English, Japanese, French, Spanish and Chinese, among others. The company says the model can analyze videos of more than 20 minutes in length and respond appropriately to questions about their content. Third-party benchmark tests compare Qwen2-VL favorably to leading competitors, and the company is releasing two open-source versions, with a larger private model to come.
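A minimal sketch of querying the open checkpoints about a video clip appears below; the model ID, the Qwen2VLForConditionalGeneration class and the qwen-vl-utils helper package are assumptions based on how these weights are typically used, not details from the announcement.

```python
# Minimal sketch: video Q&A with an open Qwen2-VL checkpoint.
# Model ID and the qwen_vl_utils helper are assumed, not confirmed above.
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from qwen_vl_utils import process_vision_info  # assumed helper package

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # assumed open-source repo name
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "video", "video": "clip.mp4"},
    {"type": "text", "text": "What happens in this video?"},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
images, videos = process_vision_info(messages)  # decodes frames from the file
inputs = processor(text=[text], images=images, videos=videos,
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
reply = processor.batch_decode(out[:, inputs.input_ids.shape[1]:],
                               skip_special_tokens=True)[0]
print(reply)
```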

OSI Aims for Industry Standard by Defining ‘Open Source AI’

Creating a universal definition of “open source AI” has generated a fair amount of debate and confusion, with many outfits stretching the parameters to make their products fit. Now the Open Source Initiative (OSI), self-described as “the authority that defines Open Source,” has issued what it hopes will become the baseline definition. That definition, which includes the ability to “use the system for any purpose and without having to ask for permission,” excludes many AI platforms that currently describe themselves as “open” but are freely available only for non-commercial use. OSI’s remaining three parameters involve the ability to inspect the system, modify it and share it.