Cerebras Systems, a leader in accelerating generative AI, has unveiled six new AI inference datacenters powered by its groundbreaking Wafer-Scale Engines. These cutting-edge facilities, equipped with thousands of Cerebras CS-3 systems, will process over 40 million Llama 70B tokens per second. With this expansion, Cerebras cements its position as the world’s top provider of high-speed AI inference and the largest domestic high-speed inference cloud.
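As a rough back-of-envelope illustration (not official figures), the release's aggregate target of 40 million Llama 70B tokens per second can be divided by an assumed fleet size to estimate per-system throughput. The release says only "thousands of CS-3 systems"; the 2,000-system count below is an assumption for illustration.

```python
# Back-of-envelope estimate of per-system throughput.
# TOTAL_TOKENS_PER_SEC comes from the release; ASSUMED_SYSTEMS is a
# hypothetical fleet size (the release says only "thousands of CS-3 systems").
TOTAL_TOKENS_PER_SEC = 40_000_000
ASSUMED_SYSTEMS = 2_000

per_system = TOTAL_TOKENS_PER_SEC / ASSUMED_SYSTEMS
print(f"~{per_system:,.0f} tokens/s per CS-3 (under these assumptions)")
```

Under these assumptions each CS-3 would need to sustain roughly 20,000 tokens per second; the actual figure depends on the real system count, which the release does not disclose.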
“These new datacenters mark a significant milestone in our 2025 AI inference scaling strategy,” said Dhiraj Mallick, COO of Cerebras Systems. “We are dramatically increasing our capacity by 20x to meet the surging demand for AI inference, ensuring businesses and research institutions have access to the most advanced AI infrastructure available.”
Cerebras AI Inference Data Centers:
- Santa Clara, CA (Online)
- Stockton, CA (Online)
- Dallas, TX (Online)
- Minneapolis, MN (Q2 2025)
- Oklahoma City, OK (Q3 2025)
- Montreal, Canada (Q3 2025)
- Midwest/Eastern US (Q4 2025)
- Europe (Q4 2025)
The Oklahoma City and Montreal datacenters will feature AI hardware exclusively owned and operated by Cerebras, while the remaining locations will be jointly managed with strategic partner G42. With 85% of its total capacity based in the United States, Cerebras is solidifying the nation’s leadership in AI infrastructure and technological innovation.
Powering the Future of AI
Since launching its high-speed AI inference offering in August 2024, Cerebras has seen a surge in demand from leading AI companies. Mistral, France’s top AI startup, relies on Cerebras to power its flagship Le Chat AI assistant. Perplexity, the world’s leading AI search engine, leverages Cerebras to provide instant search results. Additionally, Hugging Face and AlphaSense recently announced their adoption of Cerebras’ industry-leading inference capabilities.
Mallick added, “Cerebras is shaping the future of AI with unparalleled performance, scalability, and efficiency. With six new datacenters, we are meeting the global demand for AI inference, ensuring access to sovereign, high-performance AI infrastructure that fuels critical research and business transformation.”
Oklahoma City and Montreal Lead the Way
Set to go live in June 2025, the Scale Datacenter in Oklahoma City will house over 300 Cerebras CS-3 systems. Designed as a Level 3+ computing facility, it features tornado and seismic shielding, triple-redundant power stations, and state-of-the-art water-cooling technology, making it one of the most resilient datacenters in the United States.
“We’re excited to join forces with Cerebras to bring top-tier AI infrastructure to Oklahoma City,” said Trevor Francis, CEO of Scale Datacenter. “This collaboration highlights our commitment to empowering AI innovation and supporting next-generation applications.”
By July 2025, the Enovum Montreal facility will be fully operational, bringing Wafer-Scale inference technology to Canada for the first time. Enovum, a division of Bit Digital, Inc., operates the Montreal datacenter, offering Canadian enterprises, government agencies, and research institutions inference speeds 10x faster than current GPU solutions.
“At Enovum, we’re proud to partner with Cerebras, a true leader in AI innovation,” said Billy Krassakopoulos, CEO of Enovum Data Centers. “This partnership will accelerate AI advancements in Canada, providing high-performance colocation solutions tailored for next-generation workloads.”
Setting a New Standard for AI Inference
Advanced reasoning models like DeepSeek R1 and OpenAI o3 often take minutes to generate responses. Cerebras tackles this challenge by increasing inference speeds by 10x, delivering near-instant results. With hyperscale capacity coming online in Q3 2025, Cerebras is poised to become the market leader in real-time AI inference.
FAQs
1. What makes Cerebras’ AI inference datacenters unique?
Cerebras’ datacenters use proprietary Wafer-Scale Engine technology, enabling unmatched AI inference speed, scalability, and energy efficiency compared to traditional GPU-based infrastructure.
2. When will the new Cerebras datacenters be operational?
Three datacenters in Santa Clara, Stockton, and Dallas are already online. Additional sites in Minneapolis, Oklahoma City, Montreal, and other locations will launch between Q2 and Q4 2025.
3. How does Cerebras’ AI infrastructure benefit businesses and researchers?
By accelerating inference speeds up to 10x, Cerebras allows enterprises, AI startups, and research institutions to deploy cutting-edge AI models faster, improving efficiency and innovation across industries.