GTC: Nvidia Unveils Blackwell GPU for Trillion-Parameter LLMs

Nvidia unveiled what it calls the world’s most powerful AI processing system: the Blackwell GPU, purpose-built to power real-time generative AI on trillion-parameter large language models at what the company says will be up to 25x less cost and energy consumption than its predecessors. The company promises Blackwell will usher in a new era of generative AI computing. Other news from Nvidia’s GTC 2024 developer conference included the NIM software platform, designed to streamline the deployment of custom and pre-trained AI models in production environments, and the Blackwell-powered DGX SuperPOD server.

The Blackwell B200 GPU and GB200 “superchip” succeed the Hopper-architecture H100 and H200 GPUs that became the chips of choice for virtually every tech firm serious about AI. Nvidia’s announcement included testimonials from Meta’s Mark Zuckerberg, OpenAI’s Sam Altman, Oracle’s Larry Ellison and even Elon Musk.

“Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution,” said Nvidia founder and CEO Jensen Huang, who delivered the GTC keynote. “Working with the most dynamic companies in the world, we will realize the promise of AI for every industry.”

Manufactured using a custom 4-nanometer-class TSMC process, Blackwell-architecture GPUs feature 208 billion transistors across two dies connected by a 10 TB/s chip-to-chip interconnect, allowing them to function as a single GPU.

“Nvidia says the new B200 GPU offers up to 20 petaflops of FP4 horsepower from its 208 billion transistors,” writes The Verge, adding “a GB200 that combines two of those GPUs with a single Grace CPU can offer 30 times the performance for LLM inference workloads.”

By way of example, The Verge said “training a 1.8 trillion parameter model would have previously taken 8,000 Hopper GPUs and 15 megawatts of power,” whereas “2,000 Blackwell GPUs can do it while consuming just four megawatts,” according to Huang.
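A quick back-of-envelope check of those figures (all numbers taken from the quote above) shows where the savings actually come from: per-GPU power draw is roughly flat between generations, while the number of chips needed drops fourfold.

```python
# Figures quoted in the article: training a 1.8T-parameter model.
hopper_gpus, hopper_mw = 8_000, 15      # Hopper: 8,000 GPUs, 15 megawatts
blackwell_gpus, blackwell_mw = 2_000, 4  # Blackwell: 2,000 GPUs, 4 megawatts

gpu_reduction = hopper_gpus / blackwell_gpus    # 4.0x fewer GPUs
power_reduction = hopper_mw / blackwell_mw      # 3.75x less total power

# Per-GPU draw barely changes; the win is needing far fewer chips.
hopper_kw_per_gpu = hopper_mw * 1_000 / hopper_gpus        # 1.875 kW per GPU
blackwell_kw_per_gpu = blackwell_mw * 1_000 / blackwell_gpus  # 2.0 kW per GPU

print(gpu_reduction, power_reduction, hopper_kw_per_gpu, blackwell_kw_per_gpu)
```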

The DGX line “is one of Nvidia’s primary server hardware and cloud systems,” supporting Nvidia’s full AI software stack, per VentureBeat. “The DGX SuperPOD integrates the GB200 superchip version of the Blackwell, which includes both CPU and GPU resources.” VentureBeat notes “the system can deliver 240 terabytes of memory” and “has 11.5 exaflops of AI supercomputing power.”

With NIM, Nvidia aims “to create an ecosystem of AI-ready containers that use its hardware as the foundational layer with curated microservices as the core software layer for companies that want to speed up their AI roadmap,” writes TechCrunch, which notes this will streamline a deployment process that normally takes developers weeks or months.

NIM currently supports Nvidia’s own proprietary models as well as those from AI21, Adept, Cohere, Getty Images and Shutterstock, and also open models from Google, Hugging Face, Meta, Microsoft, Mistral AI and Stability AI, TechCrunch reports.
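In practice, a running NIM container exposes an OpenAI-compatible HTTP API, so a deployed model can be queried with a standard chat-completions request. The sketch below assumes a NIM microservice listening on localhost port 8000; the host, port and model name are illustrative placeholders, not details from the article.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str):
    """Assemble the URL and JSON payload for an OpenAI-style chat completion
    against a NIM endpoint (path and payload shape assume the
    OpenAI-compatible API that NIM microservices expose)."""
    url = f"{base_url}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return url, payload

# Placeholder endpoint and model id for illustration only.
url, payload = build_chat_request(
    "http://localhost:8000", "meta/llama3-8b-instruct", "Hello")

# Actual call (requires a running NIM container):
# req = urllib.request.Request(url, data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the interface mirrors OpenAI’s API, existing client code can typically be pointed at a NIM container by changing only the base URL.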

Related:
Nvidia GB200 NVL72 Delivers Trillion-Parameter LLM Training and Real-Time Inference, Nvidia, 3/18/24
Nvidia NIM Offers Optimized Inference Microservices for Deploying AI Models at Scale, Nvidia, 3/18/24
Nvidia Unveils 6G Research Cloud Platform to Advance Wireless Communications with AI, Nvidia, 3/18/24
Nvidia Joins Ongoing Race in Quantum-Computing Cloud Services, Bloomberg, 3/18/24
Nvidia Launches Cloud Quantum-Computer Simulation Microservices, Nvidia, 3/18/24
Nvidia Powers Japan’s ABCI-Q Supercomputer for Quantum Research, Nvidia, 3/18/24
Nvidia Digital Human Technologies Bring AI Characters to Life, Nvidia, 3/18/24
Nvidia Enlists Humanoid Robotics’ Biggest Names for New AI Platform GR00T, TechCrunch, 3/18/24
Nvidia’s Jensen Huang Says AI Hallucinations Are Solvable, TechCrunch, 3/19/24
Nvidia CEO Wants Enterprise to Think ‘AI Factory,’ Not Data Center, TechCrunch, 3/19/24
