AWS Goes All In On Arm-Based Graviton2 Processors With EC2 6th Gen Instances

By Patrick Moorhead - December 20, 2019
AWS Graviton2 Processor

I have been in and around processors for nearly 30 years as a systems OEM, search engine provider, chipmaker, and now CEO of the analysis and research company Moor Insights & Strategy. I tell anyone who will listen that semiconductors are hot, and it doesn’t take much convincing nowadays. You just need to look at the trillion-dollar (or near) club of Apple, Amazon, Google, and Microsoft to see how these companies are leveraging custom silicon to enhance their competitive advantages.

What I believe is the biggest piece of datacenter chip news this year was announced today at AWS re:Invent, which I am attending in Las Vegas. In preview today, AWS announced its new 6th generation EC2 instances, led by its Arm-based Graviton2. I have analyzed many Arm-based datacenter processors, but this is different. AWS announced that its M6g instances offer nearly “40% higher performance at 20% lower cost equating to a 40% improved price/performance” than its own M5 instances. Now pick your chin up off the ground.

AWS leads in silicon diversity

I believe AWS is currently leading the charge in the diversity of its custom datacenter silicon, which includes the first-gen Graviton (which we wrote about here), Inferentia (which we also wrote about here), and all the custom silicon in Nitro, which accelerates management, storage, networking, and security functions. AWS’s custom silicon sits alongside merchant processor silicon from Advanced Micro Devices, Intel, NVIDIA, and Xilinx for what I consider the cloud’s broadest compute platform. So, let’s dive into Graviton2.

Graviton2 specifications

Graviton2 is a custom-designed, 7nm SoC with 64 Arm Neoverse N1 cores (64KB L1 and 1MB L2 cache each), dual SIMD pipelines, and support for special instructions for int8 and fp16 processing. It is the first commercially deployed Neoverse N1 design I am aware of. It’s a beast at 30B transistors, which makes it more similar in size to AMD’s EPYC than to any other datacenter processor. The N1 cores are connected by a mesh architecture with approximately 2TB/sec of bandwidth, 32MB of L3 cache, and 64 lanes of PCIe Gen 4. Graviton2 servers support 8 channels of DDR4-3200, always-encrypted memory via AES-256 with an ephemeral key, and up to 1Tbit/sec of memory compression acceleration. Folks, this is very much a “big core” with some very special features like native fp16 for ML inference and always-encrypted memory. Call me impressed.
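For context on those memory specs, here is a quick back-of-the-envelope calculation (my arithmetic, not an AWS figure, assuming standard 64-bit DDR4 channels):

```python
# Back-of-the-envelope peak memory bandwidth from the Graviton2 specs above.
# Assumption: standard 64-bit (8-byte) DDR4 channels; real-world bandwidth is lower.

channels = 8                  # DDR4 channels per socket
transfers_per_sec = 3200e6    # DDR4-3200: 3200 mega-transfers/sec
bytes_per_transfer = 8        # 64-bit channel width

peak_gb_per_sec = channels * transfers_per_sec * bytes_per_transfer / 1e9
print(f"Theoretical peak: {peak_gb_per_sec:.1f} GB/s")  # → Theoretical peak: 204.8 GB/s
```

That roughly 205 GB/sec of theoretical peak bandwidth is squarely in modern server-class territory.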

Graviton2 performance 

It’s one thing to have an impressive chip architecture, but it’s another to deliver the performance needed for the datacenter. There’s more to being successful in the datacenter than raw performance, but I’m giving AWS the benefit of the doubt on reliability and availability, given that AWS services are… reliable and available. What’s different about AWS’s performance claims is that this is AWS’s performance, measured in AWS’s datacenter, running AWS’s software, compared to AWS’s current M5 instances in AWS’s datacenter.

Compared to the first-generation Graviton, AWS says Graviton2 delivers “7X the performance with 4X the compute cores and 5X faster memory.” It also offers 25Gbps networking and 18Gbps of EBS bandwidth. But how does it compare to the M5 instances powered by Intel Xeon Platinum 8000 series processors (Skylake- and Cascade Lake-SP-based) operating in all-core turbo at a sustained 3.1GHz?

Here is how much AWS says that its M6g instance outperforms M5:

  • >40% better integer performance on SPECint2017 Rate (estimate)
  • >20% better floating-point performance on SPECfp2017 Rate (estimate)
  • >40% better Java performance on SPECjvm2008 (estimate)
  • >20% better web serving performance on NGINX
  • >40% better performance on Memcached with lower latency and higher throughput
  • >20% better media encoding performance for uncompressed 1080p to H.264 video
  • 25% better BERT ML inference
  • >50% better EDA performance on the Cadence Xcelium EDA tool

These are some seriously eye-popping numbers, and remember, these benchmarks weren’t conducted in some pristine lab but in an AWS datacenter. Also remember, AWS is planning to price M6g 20% lower than M5. I am looking forward to customers benchmarking their apps on M6g and comparing those results to the AWS benchmarks above. I’m also interested in how Graviton2 compares to 2nd Gen AMD EPYC (aka “Rome”) instances, which AWS hasn’t deployed yet.
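It is worth sanity-checking how the headline numbers combine. Using my own arithmetic on the figures AWS shared (40% more performance at 20% lower price), every dollar buys 1.75x the performance, or equivalently each unit of performance costs roughly 43% less, consistent with the “roughly 40% better price/performance” framing:

```python
# Sanity-checking the headline price/performance claim (my arithmetic, not AWS's).
perf_ratio = 1.40    # M6g performance relative to M5 (+40%)
price_ratio = 0.80   # M6g price relative to M5 (-20%)

perf_per_dollar = perf_ratio / price_ratio   # performance per dollar
cost_per_perf = price_ratio / perf_ratio     # cost per unit of performance
print(f"{perf_per_dollar:.2f}x perf per dollar, "
      f"{(1 - cost_per_perf) * 100:.0f}% lower cost per unit of performance")
# → 1.75x perf per dollar, 43% lower cost per unit of performance
```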

So, what can it run today?

So, if I’m an AWS customer, what can I run on the Arm-based M6g instances? AWS said customers could run the following:

  • OS/Environment: Amazon Linux 2; Ubuntu 16.04/18.04/18.10; RHEL 7.6/8.0; SUSE Linux Enterprise Server for Arm 15; Fedora Rawhide/Atomic; Debian 9.8; Docker Desktop Community and Docker Enterprise Engine (in beta) with “more coming soon”
  • Containers: Amazon ECS and Amazon EKS (in preview), and AWS says that “the majority (>70% as of today) of Docker official images hosted in Docker Hub already have support for 64-bit Arm systems along with x86.”
  • Tools: AWS Marketplace, Systems Manager, CloudWatch, CodeBuild, CodeCommit, Cloud9, CodePipeline, Inspector, Batch, CDK, CodeDeploy, CodeStar, CLI, X-Ray and Amazon Corretto (OpenJDK distribution).
  • AWS Services: Amazon ElastiCache, EMR, Elastic Load Balancing
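On the multi-arch Docker point above: registries advertise per-architecture variants via a manifest list, so checking for arm64 support programmatically is mostly JSON inspection. A minimal sketch (the sample manifest below is illustrative, not pulled from a real image; a real one comes from the registry API with the manifest-list Accept header):

```python
import json

# Illustrative manifest list in the Docker/OCI "manifests" shape; digests are fake.
sample_manifest_list = json.loads("""
{
  "schemaVersion": 2,
  "manifests": [
    {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}
""")

def supports_arm64(manifest_list: dict) -> bool:
    """True if any entry in the manifest list targets linux/arm64."""
    return any(
        m.get("platform", {}).get("architecture") == "arm64"
        and m.get("platform", {}).get("os") == "linux"
        for m in manifest_list.get("manifests", [])
    )

print(supports_arm64(sample_manifest_list))  # → True
```

When an image carries a linux/arm64 entry like this, the same `docker pull` works unmodified on a Graviton-based instance.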

What you see here is an impressive start for cloud-native applications but, as you would expect, light on “lift and shift” enterprise environments like SAP or Oracle. VMware already has ESXi running on Graviton-based instances. That’s not to say Oracle hasn’t announced Arm support, because it has; it just isn’t supported (yet) on AWS EC2 M6g. As for SAP? Not a peep.

In AWS’s words, “If you’re using open-source software, everything you rely on most likely works on Arm systems today.” Generally speaking, if your app is currently written for x86 in a compiled language like C or C++, you will need to recompile it to run on M6g. If the app is written in an interpreted language like Python or JavaScript, then no recompile is required. You can find some interesting customer and partner testimonials over on the AWS A1 web page.
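For interpreted code, the runtime absorbs the architecture difference; a quick, generic (not AWS-specific) way to confirm which architecture you landed on is Python’s standard `platform` module:

```python
import platform

# The same script runs unmodified on x86 and Arm; only the reported machine differs.
machine = platform.machine()   # typically 'x86_64' on M5, 'aarch64' on Graviton-based M6g
print(f"Running on {machine} / {platform.system()}")

if machine == "aarch64":
    print("64-bit Arm (e.g., Graviton) -- no recompile was needed to get here")
```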

Future, planned instances for 2020

Today, M6g is available for preview, but AWS isn’t stopping there. It announced that the following EC2 6th generation instances would be coming in 2020:

  • R6g: these instances are designed for large, in-memory datasets with EBS (Elastic Block Store) storage
  • C6g: these instances are designed for compute-intensive workloads with EBS storage
  • M6gd: these instances are M6g with instance-local NVMe storage
  • R6gd: these instances are designed for large, in-memory datasets and include NVMe storage
  • C6gd: these instances are designed for compute-intensive workloads and include NVMe storage

Wrapping up

I was expecting a follow-on upgrade to the A1 instances announced at last year’s AWS re:Invent, maybe an “A2,” but nothing like what the company announced with the AWS Graviton2. AWS is dead serious about this new chip, and I don’t say this lightly. First off, this is a new 6th generation set of EC2 instances that starts with Graviton2, not what I saw as a “pipe-cleaner” with the A1 instances announced last year. AWS doesn’t mess around when it comes to new instance generations, and it’s not messing around with M6g, as it compared the instance head-to-head with the current M5 instances.

Here's something else I find impressive. Arm released the N1 IP in February of this year, and AWS is launching a fully designed and manufactured part based on that IP some 10 months later. This is not what one expects from a silicon design, development, and manufacturing cycle; it's practically unheard of. Even assuming AWS got the IP early, which I'm sure it did, this is about as agile a silicon development lifecycle as one can imagine.

You may be wondering how AWS could do something this big. This isn’t the team’s first rodeo. The team started off as Annapurna Labs, acquired by Amazon in early 2015. Before the acquisition, AWS worked closely with Annapurna Labs on Nitro, AWS’s secret sauce for offloading, accelerating, and virtualizing compute, storage, and networking. Each Annapurna chip included an acceleration ASIC and Arm cores, and I believe tens of millions of these chips are powering Nitro today.

The other thing to consider is that the AWS Graviton2 chip was architected to work inside AWS’s own datacenter architecture, not the millions of variants of enterprise datacenters. It also doesn’t have to be backward compatible with 30 years of software, which drastically reduces test time and development. It’s equally important to understand that AWS licensed the base N1 core from Arm, which embodies billions in Arm R&D and will be leveraged across numerous manufacturers, further limiting test time and effort. This is unlike other Arm datacenter plays that developed their own cores. That wasn’t naivety; it was that Arm wasn’t yet designing and licensing “big cores.”

I believe the degree of Graviton2’s success will be determined by how well AWS customers can translate the performance and cost claims in the benchmarks above into real workloads. AWS has some work to do to bring Graviton2 to all of its tools and services. Success will also depend on the price/performance of AMD’s 2nd Gen EPYC, available today from other cloud providers, and the price/performance/watt of Intel’s upcoming 10nm Xeon silicon due to arrive in 2H/2020. I’m wondering what the impact will be on the other cloud players. Will this drive them to go deeper with Arm and AMD? Time will tell. Also consider that while much of the discussion so far has been about IaaS, all of AWS’s PaaS and SaaS sits on top of that IaaS, which could soon have 40% higher performance at 20% lower cost.

Whatever the outcome, we now have even more competition in the datacenter processor market. As I said, silicon is hot. 

Note: This blog was written with content from Moor Insights & Strategy compute analyst Matt Kimball.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences and an understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience, including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.