Nvidia’s Impressive AI Model Could Compete with Top Brands

Nvidia has debuted a new AI model, Llama-3.1-Nemotron-70B-Instruct, that it claims outperforms OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet. The impressive showing has prompted speculation about an AI shakeup and a significant shift in Nvidia’s AI strategy, which has thus far focused primarily on chipmaking. The model was quietly released on Hugging Face, and Nvidia says that as of October 1 it ranked first on three top automatic alignment benchmarks, “edging out strong frontier models” and vaulting Nvidia to the forefront of the LLM field in areas like comprehension, context and generation.

‘EU AI Act Checker’ Holds Big AI Accountable for Compliance

A new LLM framework evaluates how well generative AI models are meeting the challenge of compliance with the legal parameters of the European Union’s AI Act. The free and open-source software is the product of a collaboration between ETH Zurich; Bulgaria’s Institute for Computer Science, Artificial Intelligence and Technology (INSAIT); and Swiss startup LatticeFlow AI. It is being billed as “the first evaluation framework of the EU AI Act for Generative AI models.” Already, it has found that some of the top AI foundation models are falling short of European regulatory goals in areas including cybersecurity resilience and discriminatory output.

Nvidia Releases Open-Source Frontier-Class Multimodal LLMs

Nvidia has unveiled the NVLM 1.0 family of multimodal LLMs, a powerful open-source AI that the company says performs comparably to proprietary systems from OpenAI and Google. Led by NVLM-D-72B, with 72 billion parameters, Nvidia’s new entry in the AI race achieved what the company describes as “state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models.” Nvidia has made the model weights publicly available and says it will also be releasing the training code, a break from the closed approach of OpenAI, Anthropic and Google.

Allen Institute Announces Vision-Optimized Molmo AI Models

The Allen Institute for AI (also known as Ai2, founded by Paul Allen and led by Ali Farhadi) has launched Molmo, a family of four open-source multimodal models. While advanced models “can perceive the world and communicate with us, Molmo goes beyond that to enable one to act in their worlds, unlocking a whole new generation of capabilities, everything from sophisticated web agents to robotics,” according to Ai2. On some third-party benchmark tests, Molmo’s 72 billion parameter model outperforms other open AI offerings and “performs favorably” against proprietary rivals like OpenAI’s GPT-4o, Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet, Ai2 says.

Meta Unveils New Open-Source Multimodal Model Llama 3.2

Meta’s Llama 3.2 release includes two new multimodal LLMs, one with 11 billion parameters and one with 90 billion — considered small- and medium-sized — and two lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices. Included are pre-trained and instruction-tuned versions. In addition to text, the multimodal models can interpret images, supporting apps that require visual understanding. Meta says the models are free and open source. Alongside them, the company is releasing “the first official Llama Stack distributions,” enabling “turnkey deployment” with integrated safety.

Alibaba Cloud Ups Its AI Game with 100 Open-Source Models

Alibaba Cloud last week released more than 100 new open-source variants of its large language foundation model, Qwen 2.5, to the global open-source community. The company has also revamped its proprietary offering as a full-stack AI-computing infrastructure across cloud products, networking and data center architecture, all aimed at supporting the growing demands of AI computing. Alibaba Cloud’s significant contribution was revealed at the Apsara Conference, the annual flagship event held by the cloud division of China’s e-retail giant, often referred to as the Chinese Amazon.

Alibaba’s Latest Vision Model Has Advanced Video Capability

China’s largest cloud computing company, Alibaba Cloud, has released a new computer vision model, Qwen2-VL, which the company says improves on its predecessor in visual understanding, including video comprehension and text-to-image processing in languages including English, Japanese, French, Spanish and Chinese. The company says it can analyze videos more than 20 minutes long and respond appropriately to questions about their content. Third-party benchmark tests compare Qwen2-VL favorably to leading competitors, and the company is releasing two open-source versions, with a larger private model to come.

OSI Aims for Industry Standard by Defining ‘Open Source AI’

Creating a universal definition of “open source AI” has generated a fair amount of debate and confusion, with many outfits stretching the parameters to claim a fit. Now the Open Source Initiative (OSI) — “the authority that defines Open Source” — has issued what it hopes will become the baseline definition. That definition, which includes the ability to “use the system for any purpose and without having to ask for permission,” excludes many AI platforms that currently describe themselves as “open,” a number of which are freely available only for non-commercial use. OSI’s remaining three parameters involve the ability to inspect, modify and share the system.

Meta, Oxford Advance 3D Object Generation with VFusion3D

VFusion3D is the latest AI model unveiled by Meta Platforms, which developed it in conjunction with the University of Oxford. The powerful model, which uses single-perspective images or text prompts to generate high-quality 3D objects, is being hailed as a breakthrough in scalable 3D AI that can potentially transform sectors including VR, gaming and digital design. The platform tackles the challenge of scarce 3D training data in a world teeming with 2D images and text descriptions. The VFusion3D approach leverages what the developers call “a novel method for building scalable 3D generative models utilizing pre-trained video diffusion models.”

Black Forest Labs Announces Suite of Text-to-Image Models

A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell].

Nvidia Debuts New Products to Accelerate Adoption of GenAI

After 50 years of SIGGRAPH, the conference has come full circle, from high-tech for PhDs to AI for everyone. That was Nvidia founder and CEO Jensen Huang’s message in back-to-back keynote sessions, including a Q&A with Meta CEO Mark Zuckerberg. Huang touted Universal Scene Description (OpenUSD), discussing developments aiming to speed adoption of the universal 3D data interchange framework for use in everything from robotics to the creation of “highly accurate virtual worlds for the next evolution of AI.” As Zuckerberg’s interlocutor, he prompted the Facebook founder to share a vision of AI’s personalization of social media.

Stable Video 4D Adds Time Dimension to Generative Imagery

Stability AI has unveiled an experimental new model, Stable Video 4D, which generates photorealistic 3D video. Building on what it created with Stable Video Diffusion, released in November, this latest model can take moving image data of an object and iterate it from multiple angles — generating up to eight different perspectives. Stable Video 4D can generate five frames across eight views in about 40 seconds using a single inference, according to the company, which says the model has “future applications in game development, video editing, and virtual reality.” Users begin by uploading a single video and specifying desired 3D camera poses.

Mistral, Nvidia Bring Enterprise AI to Desktop with NeMo 12B

Nvidia and French startup Mistral AI are jointly releasing a new language model called Mistral NeMo 12B that brings enterprise AI capabilities to the desktop without the need for major cloud resources. Developers can easily customize and deploy the new LLM for applications supporting chatbots, multilingual tasks, coding and summarization, according to Nvidia. “NeMo 12B offers a large context window of up to 128k tokens,” explains Mistral, adding that “its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category.” Available under the Apache 2.0 license, it is easy to implement as a drop-in replacement for Mistral 7B.

Apple Launches Public Demo of Its Multimodal 4M AI Model

Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The technology debuts seven months after the model was first open-sourced, allowing informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.”

Nvidia’s Open Models to Provide Free Training Data for LLMs

Nvidia is expanding its already substantial influence in the AI sphere with Nemotron-4 340B, a family of open models designed to generate synthetic LLM training data for commercial applications across numerous fields. Through what Nvidia is calling a “uniquely permissive” free open model license, Nemotron-4 340B provides a scalable way for developers to build LLMs. Synthetic data is artificially generated data designed to mimic the characteristics and structure of data found in the real world. The offering is being called “groundbreaking” and an important step toward the democratization of artificial intelligence.