Nvidia’s Impressive AI Model Could Compete with Top Brands

Nvidia has debuted a new AI model, Llama-3.1-Nemotron-70B-Instruct, that it claims outperforms OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet. The showing has prompted speculation about an AI shakeup and a significant shift in Nvidia’s AI strategy, which has thus far focused primarily on chipmaking. The model was quietly released on Hugging Face, and Nvidia says that as of October 1 it ranked first on three top automatic alignment benchmarks, “edging out strong frontier models” and vaulting Nvidia to the forefront of the LLM field in areas like comprehension, context and generation.

Llama-3.1-Nemotron-70B-Instruct’s strong test results include an 85.0 on Arena Hard, 57.6 on AlpacaEval 2.0 LC and 8.98 on GPT-4 Turbo MT-Bench, according to Nvidia’s post on Hugging Face.

The model release marks “a pivotal moment for Nvidia, known primarily as the dominant force in graphics processing units (GPUs) that power AI systems,” which is now demonstrating its ability to “develop sophisticated AI software,” VentureBeat reports.

In developing Llama-3.1-Nemotron-70B-Instruct, Nvidia fine-tuned Meta Platforms’ open-source Llama 3.1 model using training techniques such as reinforcement learning from human feedback (RLHF), which allows AI to learn from human preferences, with the aim of producing more natural responses that are attuned to context.
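To give a rough sense of how a model can learn from human preferences, the sketch below shows a pairwise preference loss of the kind commonly used to train reward models in RLHF pipelines. It is an illustration only and not Nvidia’s training code: the function names and the example scores are hypothetical, and the Bradley-Terry formulation is an assumption about a typical setup rather than something the company has described here.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the response people preferred gets the higher score.
    """
    margin = reward_chosen - reward_rejected
    # Two algebraically equal forms, chosen so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def pairwise_accuracy(pairs):
    """Fraction of (chosen, rejected) score pairs that are ranked correctly."""
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return correct / len(pairs)

# Hypothetical reward scores for three human-labeled comparisons.
example_pairs = [(2.1, 0.3), (0.5, 1.2), (1.7, 1.0)]

if __name__ == "__main__":
    for chosen, rejected in example_pairs:
        print(f"loss = {preference_loss(chosen, rejected):.3f}")
    print(f"accuracy = {pairwise_accuracy(example_pairs):.2f}")
```

In this sketch, training would push the loss down across many such comparisons, so that responses people prefer end up with higher scores than the ones they reject.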

“The Llama ‘herd’ of AI models, as Meta refers to them, are meant to be used as open-source foundations for developers to build on,” writes Cointelegraph, adding that “the ‘Nemotron’ portion of the model’s name encapsulates Nvidia’s contribution to the end result.” Nvidia introduced Nemotron, a family of open models, in June.

The result “has the potential to offer businesses a more capable and cost-efficient alternative to some of the most advanced models on the market,” VentureBeat notes, explaining that “the model’s ability to handle complex queries without additional prompting or specialized tokens is what sets it apart.”

“Use our customized model Llama-3.1-Nemotron-70B to improve the helpfulness of LLM generated responses in your applications,” Nvidia posted on X.

The Nemotron-70B model “scores well across all four categories: Chat, Chat-Hard, Safety, and Reasoning,” Nvidia explains in a blog post, adding that “it has an impressive performance for Safety and Reasoning, achieving 95.1 percent and 98.1 percent accuracy, respectively. This means that the model can safely reject potential unsafe responses and support RLHF in domains like math and code.”

Nvidia offers more information here.
