Analyzing Microsoft’s Datacenter Silicon Announcements At Ignite 2023

By Matt Kimball, Patrick Moorhead - December 5, 2023

Microsoft started its annual Ignite event by announcing two silicon products to be deployed in its Azure cloud platform and across some of its SaaS properties. The Azure Maia 100 AI accelerator is an application-specific integrated circuit targeting training and inference workloads. The Azure Cobalt 100 CPU is a general-purpose cloud chip that’s built on the Arm architecture. Both pieces of silicon further extend the trend of in-house silicon design that enables cloud service providers to tailor silicon for their specific datacenters and their specific technology stacks.

This article is co-written with Patrick Moorhead, CEO and chief analyst of Moor Insights & Strategy. Let’s dig into the details of Microsoft’s announcement in the following few sections.

Full disclosure: Microsoft has a paid analysis engagement with Moor Insights & Strategy, as do Amazon Web Services, Google Cloud, Oracle Cloud, IBM Cloud, Intel, Nvidia and AMD.

Is Custom Silicon Part Of A Bigger Cloud Trend?

Custom silicon isn’t the future of the cloud—it’s the present. Like its competitors AWS and Google Cloud, Microsoft is trying to deliver AI as a faster and cheaper service than the competition. Or at least to reach parity. We have believed for a while now that Azure and hence Microsoft SaaS services such as Microsoft 365 and Dynamics 365 have been at a strategic disadvantage without efficient and performant custom silicon.

Microsoft needed better control of the entire stack—from the racks and servers to the silicon to the OS and software. The company has already made many optimizations up the stack; datacenter silicon was perhaps the last big optimization knob that could be turned.

Microsoft’s Arm Journey Started Long Ago

The Ignite announcements, while notable, should not come as a shock to anybody who follows Microsoft, the cloud or silicon in general, for all the reasons outlined above. In particular, the company has been flirting with Arm for some time in both its client and server divisions and has spent significant resources optimizing the Windows OS to run on Arm chips.

While the open-source community has long supported the Arm architecture, two events accelerated its adoption: the launch of Arm’s Neoverse as a datacenter-first architecture and AWS’s acquisition of Annapurna Labs. The first generation of Neoverse was important because it demonstrated Arm’s commitment to the datacenter. AWS then harnessed Annapurna’s technology to launch Nitro as a way to offload functions that would otherwise steal valuable resources away from the expensive host processor. AWS’s launch of its Graviton CPU—also developed with Annapurna expertise—elevated Arm to first-citizen status for a general-purpose CPU in the open-source community. When the largest cloud provider deploys an architecture at scale, ISVs and contributors to open-source projects take note.

The success of Graviton surely motivated Microsoft to speed up its own Arm strategy. In 2022, the company announced a partnership with Arm CPU vendor Ampere to deploy its CPUs in Azure to support scale-out for cloud-native workloads. And 15 months later, the company is deploying its own silicon.

Maia 100 Deeper Dive

Maia is an ASIC designed to support large language model training and inferencing. It is built on TSMC’s 5nm process and supports sub-8-bit data types based on the MX open standard. The MX partnership with other silicon players (Nvidia, AMD, Intel, Qualcomm, Arm and Meta), organized under the Open Compute Project, enables faster hardware development and faster AI training and inferencing.
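The core idea behind the MX formats is pairing very-low-bit elements with a scale factor shared across a small block of values (32 elements in the published spec). As a rough illustration of that idea only, not of the exact MX element encodings, here is a sketch of block quantization with a shared power-of-two scale; the function names and the 8-bit integer element type are our own choices for the example:

```python
import numpy as np

def quantize_mx_block(block: np.ndarray, elem_bits: int = 8):
    """Quantize one block to signed integers sharing a power-of-two scale,
    in the spirit of OCP MX block formats (illustrative, not the real encoding)."""
    qmax = 2 ** (elem_bits - 1) - 1                 # 127 for 8-bit elements
    amax = float(np.max(np.abs(block)))
    if amax == 0.0:
        return 0, np.zeros(block.shape, dtype=np.int8)
    # The shared scale is a power of two, echoing MX's exponent-only scale factor
    exp = int(np.ceil(np.log2(amax / qmax)))
    q = np.clip(np.round(block / 2.0**exp), -qmax, qmax).astype(np.int8)
    return exp, q

def dequantize_mx_block(exp: int, q: np.ndarray) -> np.ndarray:
    """Recover approximate floats from the shared exponent and quantized elements."""
    return q.astype(np.float32) * np.float32(2.0**exp)

# One 32-element block, the block size used by the MX formats
rng = np.random.default_rng(0)
block = rng.standard_normal(32).astype(np.float32)
exp, q = quantize_mx_block(block)
recovered = dequantize_mx_block(exp, q)
```

Sharing one scale per small block keeps most of the dynamic-range benefit of per-tensor scaling while letting each element spend all of its few bits on precision, which is what makes sub-8-bit training and inference practical.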

Maia was tested with OpenAI’s GPT-3.5 and is currently being tested with Bing Chat and GitHub Copilot. While performance numbers have not been published, the company did say it was focused on delivering compelling benefits in terms of performance per dollar and TCO.

Microsoft’s competitors have deployed their own ASICs for AI. AWS launched its Trainium chip in 2022 after bringing out the second version of Inferentia, a chip for AI inferencing that it first introduced in 2018. Meanwhile, Google’s Tensor Processing Unit was made available to customers back in 2018.

Azure can deploy up to four Maia chips in a server. While we don’t yet have any baseline for performance, this appears to be a significant footprint. To support this configuration, Microsoft developed Sidekicks, a liquid-cooling solution that, according to Microsoft, can be quickly fitted to existing racks without any considerable retrofitting.

Cobalt 100 Deeper Dive

Microsoft’s Cobalt 100 CPU is a chip with 128 single-threaded cores that supports the Arm instruction set and is tailored for cloud-native and other workloads running in Azure. As of the announcement, Cobalt is already powering Teams, Azure SQL and other services that run on first-party (1P) servers. As with Maia, this chip is built on TSMC’s 5nm process and is designed to deliver the best performance per dollar. Microsoft claims up to 40% better performance per dollar than its current Arm deployment with Ampere. Note that the comparison is to the first-generation Ampere part, not the newest AmpereOne product.

We expect Microsoft to deploy Cobalt at scale rapidly. Because the company used Arm’s compute subsystems (CSS), it could more quickly develop Cobalt with confidence about support for the software ecosystem. CSS is a program through which companies can take pre-validated Neoverse N2 silicon and tweak it for their specific purposes and environments. In the case of a general-purpose CPU such as Cobalt, those tweaks focus on increasing power efficiency.

What This Means For The Merchant Silicon Providers

At current course and speed, both Maia and Cobalt will likely have an impact on Microsoft’s silicon partners. As expected, and like Google, AWS and Oracle, Microsoft partners with merchant silicon providers such as AMD, Intel and Nvidia to deliver choice to its customers. On the CPU front, Azure deploys AMD, Intel and Ampere. And while AMD and Intel will continue to have utility for HPC, SAP database deployments and other specific functions, I cannot see a scenario in which Ampere remains an Azure partner over the long term. I suspect Microsoft deployed Ampere understanding that it had its own part coming out in the not-too-distant future, so Ampere’s Arm footprint in Azure may prove smaller and shorter-lived than it appears.

Like Cobalt, Maia will be rapidly deployed at scale across Azure. And I expect Microsoft to continue to drive performance optimizations. While it is difficult to predict how much customers will demand and adopt Maia for their own training purposes, Microsoft’s support of OpenAI certainly helps with market acceptance.

Closing Thoughts

Microsoft’s entry into the silicon market should be a win for Azure. Tailoring the entire AI and cloud stacks to deliver the best performance at the lowest cost naturally benefits both customers and shareholders.

That said, does this move give Azure a significant competitive advantage over AWS or Google Cloud? As it relates to Cobalt, I don’t believe so. While the experience of customers should be improved, the real winner is Azure, as it will realize measurable cost savings.

Maia is a little more interesting. While in many ways Azure is only achieving parity with its competitors by delivering an ASIC for AI training, support from OpenAI should help this first-generation silicon mature and gain adoption. While I don’t believe Maia delivers any kind of blow to the competition, it certainly benefits Azure customers.

Regardless of the competitive positioning, cloud providers designing and deploying custom silicon to drive optimal performance and lower costs can be a big benefit to the market, as long as these tailored solutions aren’t anchored with cloud provider lock-in.

Matthew Kimball

Matt Kimball is a Moor Insights & Strategy senior datacenter analyst covering servers and storage. Matt’s 25-plus years of real-world experience in high tech span from hardware to software as a product manager, product marketer, engineer and enterprise IT practitioner. This experience has led to a firm conviction that the success of an offering lies, of course, in a profitable, unique and targeted product, but most importantly in the ability to position and communicate it effectively to the target audience.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences and an understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including cloud, enterprise SaaS, collaboration, client computing and semiconductors. He has 30 years of experience, including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing and corporate marketing, including three industry board appointments.