Meta’s Llama 3.3 Delivers More Processing for Less Compute

Meta Platforms has packed more artificial intelligence into a smaller package with Llama 3.3, which the company released last week. The open-source large language model (LLM) “improves core performance at a significantly lower cost, making it even more accessible to the entire open-source community,” Meta VP of Generative AI Ahmad Al-Dahle wrote in a post on X. The 70-billion-parameter, text-only Llama 3.3 is said to perform on par with the 405-billion-parameter model from Meta’s Llama 3.1 release in July while requiring less computing power, significantly lowering its operational costs.

The multilingual model is available at Meta’s Llama.com and on Hugging Face, offered under a Community License Agreement granting non-exclusive, royalty-free use, reproduction, distribution and modification.
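For developers who want to try the Hugging Face release, a minimal sketch of loading it with the transformers library might look like the following. The repository ID shown is an assumption, access is gated behind Meta’s license on Hugging Face, and a 70B model still needs multiple high-memory GPUs (or a quantized variant) to run:

```python
# A minimal sketch of loading Llama 3.3 70B Instruct with Hugging Face transformers.
# The repo ID below is an assumption; access is gated behind Meta's license on
# Hugging Face, and running the full 70B model requires several high-memory GPUs.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed Hugging Face repo ID
    torch_dtype=torch.bfloat16,  # half precision keeps the memory footprint down
    device_map="auto",           # spread the weights across available GPUs
)

prompt = "Explain in one sentence what changed in Llama 3.3."
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```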

“While the license is generally free, organizations with over 700 million monthly active users must obtain a commercial license directly from Meta,” VentureBeat reports, delving into the question of how much savings Llama 3.3 actually offers.

Users can realize savings of “nearly 1940 GB worth of GPU memory, or potentially, achieve 24 times reduced GPU load for a standard 80 GB Nvidia H100 GPU,” which translates to roughly “$600,000 in up-front GPU cost savings, potentially — not to mention the continuous power costs” at “an estimated $25,000 per H100 GPU,” notes VentureBeat.
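Those figures roughly reconcile if the “24 times reduced GPU load” is read as about 24 fewer H100-class cards, which appears to be how the dollar figure was derived. The sketch below simply re-does that back-of-the-envelope arithmetic with the numbers quoted above; the reading of the “24 times” figure is an assumption:

```python
# Back-of-the-envelope check using only the figures quoted by VentureBeat.
# Treating "24 times reduced GPU load" as roughly 24 fewer H100-class cards
# is an assumption made for this arithmetic.
H100_MEMORY_GB = 80      # memory on a standard Nvidia H100
H100_PRICE_USD = 25_000  # VentureBeat's estimated price per H100
GPUS_SAVED = 24

memory_saved_gb = GPUS_SAVED * H100_MEMORY_GB      # 1,920 GB, near the ~1,940 GB cited
upfront_savings_usd = GPUS_SAVED * H100_PRICE_USD  # $600,000 in up-front GPU cost

print(f"~{memory_saved_gb} GB of GPU memory, ~${upfront_savings_usd:,} in hardware")
```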

Al-Dahle’s X post includes “a chart showing Llama 3.3 70B outperforming Google’s Gemini 1.5 Pro, OpenAI’s GPT-4o, and Amazon’s newly released Nova Pro on a number of industry benchmarks, including MMLU, which evaluates a model’s ability to understand language,” TechCrunch points out, adding that Meta says “the model should deliver improvements in areas like math, general knowledge, instruction following, and app use.”

With that benefit come some requirements: attribution along the lines of “built with Llama” is a must for those who integrate Llama 3.3 into their products and services, while an “Acceptable Use Policy” provision prohibits the generation of harmful or illegal content.

Meta’s latest model release is part of a “play to dominate the AI field with ‘open’ models that can be used and commercialized for a range of applications,” TechCrunch writes. “For many, it’s immaterial that Llama models aren’t ‘open’ in the strictest sense” (or according to the definition set forth by the Open Source Initiative). The proof is in Meta’s assertion that Llama models have received more than 650 million downloads.

According to a GitHub model card, Llama 3.3 has been pretrained using “data from publicly available sources,” and fine-tuned using “publicly available instruction datasets, as well as over 25M synthetically generated examples.” The pretraining data has a cutoff of December 2023.
