AMD Laid The Groundwork For Big Hyperscaler AI Accelerator Play In The Future

By Patrick Moorhead - July 12, 2023

Yesterday AMD held an in-person event in San Francisco to cover the latest in its datacenter and AI progress. It made at least eight important and meaty disclosures across its Epyc, Pensando, and Instinct lines. I won’t attempt to cover them all; instead, I will focus on what the company spent about a third of the program on, which was AI. Matt Kimball, our Compute Principal Analyst, will drill into Epyc later this week.

While most of the AI content was about AI accelerators for the hyperscale datacenter, I believe looking at the long-term totality of the AI opportunity is important. Ironically, most AI inference gets processed on CPUs today, and I don’t see that changing any time soon as CPU providers add fixed-function blocks to support specific AI inference workloads. From an end-platform standpoint, AI spans from the smallest IoT device to smartphones to PCs, the datacenter edge, and the largest hyperscale datacenters. That AI will be spread across CPUs, GPUs, FPGAs and NPUs up and down the value chain.

AMD has AI capabilities now in its 4th Gen Epyc processors (DC & edge), Versal AI SoCs (edge), Ryzen 7040 processors (notebooks), Alveo accelerator cards (DC & edge) and, of course, Instinct accelerators (HPC & DC). Any way you slice it, I believe AMD will get a lift from AI. It’s an inevitability.

Now let’s dive into AMD’s hyperscale datacenter AI content.

AMD made the following datacenter AI accelerator disclosures yesterday:

  1. AMD Instinct MI300X- New card announced for LLM inference in the hyperscale datacenter with 192GB HBM3, sampling Q3. The company claimed it provides 2.4X HBM density and 1.6X HBM bandwidth versus the NVIDIA H100, and the company says it can do a lot more work with fewer GPUs than an 80GB H100. I am wondering how the one-card 192GB MI300X performs versus the two-card 188GB H100 NVL. My guess is AMD can’t get ahold of one yet. AMD showed the first public demo of the MI300X running Hugging Face’s 40B parameter Falcon model on one card. This was impressive, to say the least.
  2. AMD Instinct MI300A- Targeted to HPC markets, the accelerator combines an Instinct GPU with Zen 4 CPU cores and 128GB HBM3 and is sampling now.
  3. AMD Infinity Architecture Platform- Eight MI300X cards for training and inference for LLMs with 1.5TB of HBM3.
  4. AMD has large AI deployments: Azure (in Explorer), LUMI, and Korea Telecom. I was already aware of the Frontier supercomputer.
  5. AI Strategy- “open” (software), “proven” (AI capabilities), and “ready” (support today for AI models)
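
As a rough sanity check on the capacity figures in the list above, here is a back-of-the-envelope sketch. The 2-bytes-per-parameter figure (16-bit weights) is my assumption, not something AMD stated, and the estimate covers model weights only, ignoring KV cache and activation memory.

```python
# Back-of-the-envelope check of the HBM figures quoted in the list above.
# Assumption (not from AMD): weights are stored at 16-bit precision,
# i.e. 2 bytes per parameter. KV cache and activations are ignored.

MI300X_HBM_GB = 192      # per-card HBM3 capacity quoted for MI300X
H100_HBM_GB = 80         # the 80GB H100 used as the comparison point
CARDS_PER_PLATFORM = 8   # Infinity Architecture Platform card count
BYTES_PER_PARAM = 2      # assumed FP16/BF16 weights


def weight_footprint_gb(params_billion, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


capacity_ratio = MI300X_HBM_GB / H100_HBM_GB
platform_hbm_gb = MI300X_HBM_GB * CARDS_PER_PLATFORM
falcon_40b_gb = weight_footprint_gb(40)

print(f"MI300X vs. 80GB H100 capacity: {capacity_ratio:.1f}x")
print(f"Eight-card platform HBM: {platform_hbm_gb} GB")
print(f"Falcon 40B weights at 16-bit: ~{falcon_40b_gb} GB")
print(f"Weights fit on one MI300X: {falcon_40b_gb < MI300X_HBM_GB}")
print(f"Left on one 80GB H100 after weights: {H100_HBM_GB - falcon_40b_gb} GB")
```

Under that assumption, the 40B model's weights alone would consume an entire 80GB card, which is consistent with why a single-card demo on 192GB stands out.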

While AMD may not have satisfied the shortest-term investors yesterday, its disclosures showed me that the company is making the investments to compete with NVIDIA long-term in the hyperscale cloud AI accelerator market, which easily has a $125B TAM by 2028. To think that AMD would deliver some knockout blow to NVIDIA yesterday would show that you don’t understand how technology, ecosystems or the market operate. Yesterday’s announcements were what I consider step three in a ten-step AMD plan.

The fact is that AMD has had success with its Instinct cards in areas where end customers write their own software- the highest performance computing installations, like Frontier. That’s an outlier, though. NVIDIA has built a giant AI moat with CUDA and its libraries for those who don’t want to write to the metal. ROCm is AMD’s AI software stack spanning AI models and algorithms, libraries, compilers, and tools, and of course, base drivers for Instinct. ROCm has been around for a long time. In fact, my company wrote a white paper on it seven years ago when it was focused on HPC, not datacenter AI. I have always believed AMD could field competitive AI hardware and have always had questions about the software. I will become a bigger believer in ROCm for hyperscale datacenter AI when a large hyperscaler provides public support for running production workloads. I do believe this will eventually happen. Why?

AMD ROCm Optimized AI Software Stack

Cloud hyperscalers appreciate that NVIDIA has brought many of these AI capabilities to them first, but they’re uncomfortable with the lack of a broad supply chain and choice.

Long-term, the cloud players can either use their own accelerators (e.g., AWS Inferentia and Trainium, Google TPU, Meta MTIA V1 accelerators), work to create a second player like AMD (or Intel) or continue to sole source from NVIDIA. I think large CSPs will do all three.

GPUs, unlike the ASICs from AWS, Google and Meta, are much more programmable, which is why GPUs still rule the roost holistically across a wide variety of AI even though ASICs are lower power with higher performance. I believe AMD, if it passes muster with its Q3 sampling, could pick up incremental AI business starting end of Q4 2023 to Q1 2024. And for what it is worth, AI framework providers like PyTorch and foundational model providers like Hugging Face want more GPU vendors, too, to enhance their innovation.

NVIDIA has at least 95% market share in the hyperscale datacenter AI accelerator space, which AMD characterizes as a $30B TAM today moving to $150B by 2027. So even if AMD captures 20% of this market by 2027, driving $30B in revenue, there is still massive growth for NVIDIA. Both companies can win in the hyperscale datacenter AI accelerator space. Anybody who thinks this will be a winner-take-all scenario doesn’t understand technology, ecosystems or, for that matter, business logic. If all that doesn’t work, as we have seen with Microsoft in 1998, and Google and Amazon this decade, global regulators show up. But I don’t think we will get there. I believe AMD can capture 10-20% of this market by 2027. First, AMD needs a major CSP or hyperscaler to show up with big support that leads to big volumes and revenue. If not, all bets are off.
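
The market arithmetic in the paragraph above can be laid out as a quick sketch. It uses only the TAM and share figures quoted there; treating everything AMD does not capture in 2027 as available to NVIDIA and others is a simplifying assumption for illustration, not a forecast.

```python
# Market arithmetic using the TAM and share figures quoted above.
# Assumption (illustrative only): whatever AMD does not capture in 2027
# is treated as the pool left for NVIDIA and every other supplier.

TAM_TODAY_B = 30             # $B, today's TAM as characterized by AMD
TAM_2027_B = 150             # $B, AMD's 2027 TAM figure
NVIDIA_SHARE_TODAY_PCT = 95  # "at least 95%" per the text


def revenue_b(tam_b, share_pct):
    """Revenue in $B for a given TAM ($B) and market share (percent)."""
    return tam_b * share_pct / 100


nvidia_today_b = revenue_b(TAM_TODAY_B, NVIDIA_SHARE_TODAY_PCT)

# Map each AMD share scenario to (AMD revenue, remaining market), both in $B.
scenarios = {}
for amd_share_pct in (10, 20):
    amd_2027_b = revenue_b(TAM_2027_B, amd_share_pct)
    scenarios[amd_share_pct] = (amd_2027_b, TAM_2027_B - amd_2027_b)

print(f"NVIDIA today at 95% of $30B: ${nvidia_today_b:.1f}B")
for pct, (amd_b, rest_b) in scenarios.items():
    print(f"AMD at {pct}% in 2027: ${amd_b:.0f}B; remainder: ${rest_b:.0f}B")
```

Even in the 20% scenario, the remaining pool is several times larger than NVIDIA's implied share of today's market, which is the point of the "both companies can win" argument.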


Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.