Cerebras Is Moving into the Mainstream with New AI Data Centers
March 17, 2025
Cerebras Systems was founded 10 years ago on the belief that there would be a shortage of processors powerful enough to drive enterprise AI computing at scale. Its solution, the Cerebras Wafer-Scale Engine, is integrated into Cerebras’ CS-3 systems, which will power six new data centers launching this year that the company says will make it “the world’s number one provider of high-speed inference and the largest domestic high speed inference cloud.” Cerebras notes the new facilities will collectively serve over 40 million Llama 70B tokens per second to clients that now include Hugging Face and financial intelligence firm AlphaSense.
Cerebras says its Wafer-Scale Engine (WSE-3) processor “can run AI models 10 to 70 times faster than GPU-based solutions,” a speed advantage that “has become increasingly valuable as AI models evolve toward more complex reasoning capabilities,” VentureBeat writes.
The company, which has been preparing for an IPO since September, already has three data centers online: in Santa Clara and Stockton, California, as well as Dallas, Texas. It will open a site in Minneapolis in Q2, followed in Q3 by Oklahoma City, Oklahoma, and Montreal, Canada. Q4 will bring sites in the eastern and midwestern U.S., as well as Europe, per its announcement.
Collectively, the facilities will be equipped with thousands of CS-3 systems, delivering performance Cerebras says is orders of magnitude beyond what high-end commercial data center networks, where Nvidia processors dominate, currently provide. Unlike the big commercial clouds (Microsoft’s Azure, Amazon’s AWS and Google Cloud), which maintain mixed workloads, Cerebras’ data centers will focus exclusively on inference.
While those companies each have between 100 and 135 data centers worldwide, boutique-scaled Cerebras is throwing down a marker that it wants to own the inference space. The company is backed by top venture capital firms — including Benchmark, Foundation Capital, and Alpha Wave Ventures. It is targeting an IPO valuation of $7-8 billion.
Cerebras’ data center expansion “could be bad news for Nvidia,” writes VentureBeat. Until now, Cerebras’ wafer-sized chips (the WSE-3 packs 900,000 AI cores onto each processor) have been limited to specialty customers: national labs, large enterprises, and the data centers that serve them. Known clients include Argonne National Laboratory, the Mayo Clinic and GlaxoSmithKline, which use the chips to train models with trillions of parameters.
A single WSE-3 is believed to cost $2-3 million, whereas Nvidia’s high-end Blackwell chips cost no more than $50,000 each, and the popular H100s run $30,000-40,000.
A typical CS-3 system could cost $7 million, potentially double that with maxed-out memory. The Oklahoma City facility alone will house more than 300 CS-3 systems, reports The Decoder. “CS-3s are quickly and easily clustered together to make the largest AI supercomputers in the world, and make placing models on the supercomputers dead simple by avoiding the complexity of distributed computing,” the company announcement explains.
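Those figures allow a rough, back-of-envelope sense of the capital involved. The sketch below simply multiplies the estimated per-system price quoted above by the reported Oklahoma City system count; both numbers are the article’s estimates, not confirmed Cerebras pricing, so the result is illustrative only.

```python
# Back-of-envelope estimate only; the unit price and system count are
# the figures quoted in this article, not confirmed Cerebras numbers.
cs3_unit_cost = 7_000_000        # estimated cost of a typical CS-3 system (USD)
oklahoma_city_systems = 300      # "more than 300" CS-3 systems reported for that site

hardware_estimate = cs3_unit_cost * oklahoma_city_systems
print(f"Implied hardware outlay for Oklahoma City: ${hardware_estimate / 1e9:.1f}B+")
# Implied hardware outlay for Oklahoma City: $2.1B+
```

Even at the low end, a single facility of that size implies a hardware outlay in the billions, which underscores why the company ties the expansion to its IPO plans.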
Cerebras’ data center inference pricing does not appear to directly correlate to the cost of CS-3 systems. Last year the company said its inference services were “a fraction of the cost of hyperscale and GPU clouds.” The deal with Hugging Face makes Cerebras inference available with a click to the platform’s five million developers, a strategic marketing move as the company seeks to build a commoditized service prior to its IPO.
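For developers, the Hugging Face arrangement means Cerebras appears as a selectable inference provider inside tooling they already use rather than a separate signup. Below is a minimal sketch of what such a call might look like with the huggingface_hub client; the provider identifier, model ID and token shown are assumptions for illustration, not details confirmed by the announcement.

```python
# Sketch of routing a chat request to Cerebras-backed inference via Hugging Face.
# The provider name, model ID and token below are illustrative assumptions.
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cerebras",   # assumed provider identifier for Cerebras inference
    api_key="hf_xxx",      # a Hugging Face access token
)

response = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",  # assumed Llama 70B model ID
    messages=[{"role": "user", "content": "Summarize wafer-scale inference in one sentence."}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```

The appeal of this route, as the article frames it, is distribution: the inference request looks the same as any other Hugging Face provider call, with the wafer-scale hardware hidden behind the platform.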
“This year, our goal is to truly satisfy all the demand and all the new demand we expect will come online as a result of new models like [Meta Platforms’] Llama 4 and new DeepSeek models,” Cerebras Director of Product Marketing James Wang tells VentureBeat.