AMD Next Horizon with AMD President/CEO Dr. Lisa Su. PATRICK MOORHEAD
In August 2016, when AMD announced at an event in San Francisco that its Zen architecture would provide a 40% IPC improvement, I will admit I was skeptical. I was hopeful, as the industry needed more competition, but skeptical, as the company hadn't had much commercial success in CPUs for years. Eight years, to be exact. A little over two years later, few could have imagined that AMD would be fielding an incredibly competitive and differentiated lineup of processors for desktops, notebooks and datacenter servers. Last week, I attended another AMD event in San Francisco that provided more details on its next-generation 7nm datacenter processors. Here's my rundown of everything that was announced, and my take on it.
EPYC availability on AWS
AWS’s Matt Garman and AMD CEO Lisa Su discuss the three new instances. PATRICK MOORHEAD
One of the biggest pieces of news was AMD and Amazon Web Services' joint announcement of the immediate release of the first EPYC-based instances on AWS's popular Elastic Compute Cloud (EC2). AMD says these instances offer "industry-leading core density and memory bandwidth," with the potential for significant performance-per-dollar savings.
Matt Garman, AWS’s Compute VP, substantiated AMD’s claims by saying AMD delivers security, reliability, and performance. The R5 and M5 instances are now available in select regions, with “no software or script change”, and can be accessed through the AWS Management Console and AWS Command Line Interface. AMD says the EPYC-based T3 instances will be available in the coming weeks and the company heralds this announcement as a milestone in EPYC’s growing adoption. It’s hard to argue with that.
This announcement could be the biggest one of the year for AMD. These aren't less popular instances, as on Microsoft Azure and Oracle Cloud; these are very popular instances from the #1 public cloud provider. It's black and white. I was also struck that AWS's Garman said in very plain terms that customers can expect the same performance at a 10% discount, with no software changes needed. I will be interested to see if Azure gets more aggressive with AMD or if Google GCP signs up.
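To put Garman's claim in concrete terms, here is a back-of-envelope sketch. This is my own arithmetic, not an AWS figure, and it assumes only what was stated on stage: equal performance at a 10% lower price, which works out to roughly an 11% gain in performance per dollar.

```python
# Back-of-envelope: what a 10% price discount at equal performance
# means for performance per dollar. The baseline price is normalized
# to 1.00 and is illustrative only.
baseline_price = 1.00              # normalized hourly price of a comparable instance
amd_price = baseline_price * 0.90  # AWS's stated 10% discount
performance = 1.0                  # "same performance", per AWS's Matt Garman

perf_per_dollar_baseline = performance / baseline_price
perf_per_dollar_amd = performance / amd_price

gain = perf_per_dollar_amd / perf_per_dollar_baseline - 1
print(f"Performance-per-dollar gain: {gain:.1%}")  # ~11.1%
```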
The first 7nm datacenter GPUs
Another big announcement at the event was the unveiling of the AMD Radeon Instinct MI60 and MI50 accelerators, which AMD accurately calls "the world's first 7nm datacenter GPUs." These GPUs are based on AMD's Vega architecture and geared towards powering HPC, deep learning, cloud computing, VDI, and rendering applications. They are optimized for deep learning, equipped to tackle everything from training complex neural networks to running inference against those networks.
AMD GPU lead David Wang announces Instinct cards. PATRICK MOORHEAD
To process these demanding workloads, AMD says the new accelerators enable "ultra-fast" floating-point performance and feature HBM2 memory (with impressive speeds up to 1TB/s). The MI60 features 32GB of HBM2 ECC memory, while the MI50 sports 16GB. As far as connectivity goes, these accelerators are the first to offer support for the PCIe 4.0 interconnect. AMD says the MI60 is the world's fastest double precision PCIe 4.0 capable accelerator, boasting peak FP64 performance of up to 7.4 TFLOPS. The MI50, meanwhile, maxes out at 6.7 TFLOPS, providing a more economical solution for intensive workloads. These accelerators also feature two AMD Infinity Fabric Links apiece, which AMD says can deliver data as much as 6 times faster than PCIe 3.0 by itself. Additionally, these links allow customers to connect up to 4 GPUs in a hive-ring configuration using Infinity Fabric.
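For context on that Infinity Fabric claim, here's a rough sketch. The arithmetic is my own: the ~15.75 GB/s one-direction baseline follows from the PCIe 3.0 spec (8 GT/s per lane, 128b/130b encoding, x16), and I'm taking AMD's "6x" multiplier at face value, since the announcement didn't spell out whether it applies per link or in aggregate.

```python
# Back-of-envelope: AMD's "6x faster than PCIe 3.0" Infinity Fabric claim
# measured against a PCIe 3.0 x16 baseline. PCIe 3.0 runs at 8 GT/s per
# lane with 128b/130b line encoding, giving ~15.75 GB/s in one direction.
lanes = 16
transfer_rate_gt = 8.0            # PCIe 3.0: 8 GT/s per lane
encoding = 128 / 130              # 128b/130b line-encoding efficiency
pcie3_x16 = lanes * transfer_rate_gt * encoding / 8  # GB/s, one direction

infinity_fabric = 6 * pcie3_x16   # AMD's claimed multiplier, taken at face value

print(f"PCIe 3.0 x16:       {pcie3_x16:.2f} GB/s")
print(f"6x Infinity Fabric: {infinity_fabric:.1f} GB/s")
```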
With these new datacenter GPUs, I can see AMD having some degree of success in VDI and cloud gaming. In VDI, AMD provides hardware-based GPU virtualization, not requiring costly software as with other implementations. The accelerators include Secure Virtualized Workload Support, through AMD’s MxGPU hardware-based GPU virtualization solution. I believe AMD’s close relationships with Microsoft and Sony with Xbox and the PlayStation give it a leg up in these markets, too.
As for AI, deep learning and machine learning, it's harder to predict how AMD will do, as there are many moving parts. AMD is comparing its to-be-shipped cards against cards NVIDIA has been shipping for a while. Also, AI, ML and DL workloads vary widely: training is very different from inference, and 64-, 32-, 16-, 8- and 4-bit performance and efficiency vary, particularly where some players have ML ASICs. AMD has typically had competitive hardware for many workloads, but software matters as much as hardware.
At the event, AMD demonstrated its further embrace of open source with these new accelerators, announcing a new version of the ROCm open software platform for accelerated computing. ROCm 2.0 features new math libraries, optimized deep learning operations, broader software support, and will support both the MI50 and MI60 accelerators. AMD says the new edition of ROCm will enable customers to "deploy high-performance, energy-efficient heterogeneous computing systems in an open environment." Google was quoted supporting AMD, but it struck me as more a show of support for open source software than for ROCm specifically.
I am waiting for that "Zen moment" for AMD's Radeon datacenter GPUs, and time will tell whether these cards deliver it. The market does want more players, and given AMD's overall track record in the datacenter with CPUs and GPUs, the company seems like a natural. I'd like to see the datacenter GPU team focus on what AMD founder Jerry Sanders used to call a "clean kill," those areas where someone would be dumb to buy anything except AMD. After staking out a dominant presence in those "clean kill" markets, AMD earns permission to stake out its next datacenter GPU markets.
Zen 2 details
AMD also took the opportunity to provide the first details of its forthcoming "Zen 2" x86 processor architecture. AMD says with Zen 2 it is adopting a "chiplet" approach, linking separate pieces of silicon together via improved Infinity Fabric inside a single processor package. Zen 2 utilizes AMD's new 7nm technology for the CPU cores and 14nm for input and output. AMD says this should result in higher performance and lower power consumption, and the smaller 7nm die should also improve yields. This chiplet approach is the direction the entire industry is going, so take note. Many companies are publicly staking their futures on chiplets, and it's nice to see AMD delivering the first phases of it. In the future, I expect the industry to adopt even more aggressive heterogeneous chiplets, with third-party IP using 2.5D and 3D packaging.
AMD’s Mark Papermaster makes some aggressive commits on Zen 2. PATRICK MOORHEAD
AMD says Zen 2 builds on the original Zen with an improved execution pipeline, front-end advances, and advanced security features. Additionally, it says the added 7nm benefits supersized Zen 2's floating-point units, which should make a big difference in FPU-intensive HPC applications. I would have loved to have seen big IPC commitments like the ones it made with the original Zen architecture, but I think I/O, bandwidth and power are the foci, not IPC. AMD's Mark Papermaster did commit to "2X the throughput," "doubled FPU," "doubled core density," and "half the energy per operation." The bigger performance commitments were made for EPYC using Zen 2, covered below.
Updated EPYC based on Zen 2 architecture on 7nm
AMD also provided new details on the next generation of its EPYC processors, code-named "Rome," which will include up to 64 of the new Zen 2 cores. Rome will be the first 7nm x86 CPU to hit the market. With the details I have right now, I am equating TSMC 7nm with Intel 10nm, so don't be confused by the "smaller is better" numerology; it isn't always the case.
AMD datacenter lead Forrest Norrod does a deserved Epyc victory lap. PATRICK MOORHEAD
Rome sports the industry's first PCIe 4.0 capable x86 server bus (IBM POWER already supports PCIe 4.0), and AMD is claiming twice the compute performance per socket and four times the floating-point performance of current EPYC processors. Additionally, Rome features increased instructions per cycle and more bandwidth for compute, I/O, and memory. Lastly, and importantly, AMD says Rome is compatible with today's EPYC server platforms by using the same socket. It's unclear what performance hit you'll take by not using a new socket design, but AMD seemed very confident in same-socket performance. I'm sure AMD is holding back details to surprise us down the road.
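The two headline per-socket numbers line up with the Zen 2 commitments Papermaster made earlier in the day; here's a quick sanity check of my own (core counts per AMD's announcement):

```python
# Cross-check AMD's Rome per-socket claims against the Zen 2 commitments:
# doubled core density and doubled FPU width.
naples_cores = 32    # top current-gen EPYC "Naples" SKU
rome_cores = 64      # announced top Rome SKU (2x core density)
fpu_multiplier = 2   # "doubled FPU": wider floating-point units per core

compute_gain = rome_cores / naples_cores  # matches "2x compute per socket"
fp_gain = compute_gain * fpu_multiplier   # matches "4x floating-point per socket"

print(f"Compute gain per socket:        {compute_gain:.0f}x")
print(f"Floating-point gain per socket: {fp_gain:.0f}x")
```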
AMD’s Lisa Su commits to 7nm Epyc delivering 2X performance and 4X FPU performance per socket. PATRICK MOORHEAD
Of big interest to me is what happens when you combine Instinct with the new EPYC. PCIe 4.0 is a nice addition, but there could be more. While AMD talked about how Infinity Fabric could be used to share memory across Instinct GPUs, it didn't talk about what happens if you extend the GPU via Infinity Fabric to system memory, providing complete system memory coherence. Maybe I am just inventing something in this article that doesn't exist, but maybe AMD has finally adopted a systems approach using GPU, CPU and a memory fabric. AMD's datacenter group likes to slowly dribble out information in a rolling thunder approach, but maybe we'll see this at the next EPYC event.
The AMD Next Horizon event did not disappoint—we got a deeper look at the future of EPYC, the Zen architecture, and even Infinity Fabric. We also saw more details on AMD's newly released 7nm Instinct GPU cards. Additionally, the new EPYC-based AWS instances offer strong proof of these processors' value proposition and will likely make a positive financial impact. I believe AMD has its best chance ever to capitalize on its datacenter server opportunity with CPUs and GPUs, and over the next few weeks, the team and I will be doing some deeper dives into what was announced.
Note: Moor Insights & Strategy writers and editors may have contributed to this article.