Artificial intelligence, machine learning and deep learning are some of the hottest areas in all of high tech today. We’ve had a few generations of AI over the last 50 years, but IBM kicked off the latest cycle in 2011 when Watson won “Jeopardy!” using brute-force, Big Data techniques. In 2012, a University of Toronto team won the ImageNet image-recognition challenge using deep learning. NVIDIA then began to drive GPU-accelerated training of deep neural nets (DNNs), and in the course of that, huge service providers announced initiatives, beginning with Microsoft, Google, Apple, Samsung, and then Amazon. Chinese giants Baidu, Alibaba and Tencent are, of course, involved. Intel recently held an entire AI day outlining their strategy and announcing their roadmap. Both Qualcomm and Xilinx are active as well.
“Tight call” for Tesla self-driving car electronics
Given all this activity and Advanced Micro Devices’ stature in GPUs, which are vital to leading-edge DNN training, everyone was wondering if and when AMD would enter the market. The public got a sneak peek of AMD’s involvement in a conference call with Tesla founder and CEO Elon Musk, where he remarked that it was a “tight call” between AMD and NVIDIA for the self-driving electronics. My jaw dropped: I knew AMD could enter the market if they invested the resources, but I didn’t think they would add another thing to their plate. Today, they added DNN accelerators and software to their plate, and it’s all upside for them from a market perspective.
Radeon Instinct accelerator cards
Advanced Micro Devices announced three accelerators: two for inference and one for training. They are branding these “Radeon Instinct”, representing just what machine learning is supposed to do, which is to get machines to have human instincts, the pinnacle of AI.
Here is the Radeon card lineup announced today, which AMD expects to be available in 1H 2017, meaning Q2 2017:
- Instinct MI6: For inference, based on the shipping Polaris architecture, delivering 5.7 TFLOPS (FP16 per AMD), in a full-length board form factor, pulling 150 watts.
- Instinct MI8: For inference, based on the shipping Fiji architecture with HBM, delivering 8.2 TFLOPS (FP16 per AMD), in a short board form factor, pulling less than 175 watts.
- Instinct MI25: For training, based on the upcoming Vega architecture, performance TBD, in a full-length board form factor, pulling less than 300 watts.
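For a rough sense of efficiency, the quoted figures can be turned into FP16 GFLOPS per watt. The sketch below is only back-of-the-envelope arithmetic on AMD’s stated numbers, not a benchmark. The `cards` dictionary and the `gflops_per_watt` helper are illustrative names of my own, and because the MI8 is rated at “less than 175 watts”, its result should be read as a lower bound. The MI25 is left out because its performance is still TBD.

```python
# Back-of-the-envelope efficiency from the figures quoted above (FP16, per AMD).
# All names here are illustrative; none of this comes from AMD tooling.
cards = {
    "MI6": {"tflops": 5.7, "watts": 150},  # board power as stated
    "MI8": {"tflops": 8.2, "watts": 175},  # "less than 175 watts", so 175 is a ceiling
}


def gflops_per_watt(tflops, watts):
    """Convert a TFLOPS rating and board power into GFLOPS per watt."""
    return tflops * 1000.0 / watts


efficiency = {
    name: gflops_per_watt(spec["tflops"], spec["watts"])
    for name, spec in cards.items()
}

for name, value in efficiency.items():
    print(f"{name}: about {value:.1f} FP16 GFLOPS per watt")
```

On these numbers the Fiji-based MI8 comes out ahead of the Polaris-based MI6, though real-world efficiency will depend on the workload.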
Radeon Instinct platforms
Customers can buy the cards and plug them into any platform that supports the thermals and power, or they can buy platforms from Inventec and Supermicro, both of which showed their support at AMD’s secret AI event last week.
- Supermicro “SuperServer”: 1U rack-mount server, dual-socket Xeon E5-2600 v4, 3 full-length Radeon Instinct cards
- Inventec K888 G3: 2U rack-mount server, dual-socket Xeon E5-2600 v3, 4 full-length MI25 Radeon Instinct cards, up to 100 TFLOPS (FP16 per AMD)
- “Falconwitch”: disaggregated PCI-E design with support for 16 full-length MI25 cards, up to 400 TFLOPS (FP16 per AMD)
- Inventec Rack: 39U rack design, 6x Falconwitch switches, 2x ToR (top-of-rack) switches, 120 full-length MI25 cards, up to 3,000 TFLOPS (FP16 per AMD)
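The MI25 platform totals are consistent with one another: dividing each quoted figure by its card count gives the same implied per-card number. Here is a quick check using only the figures in the list above. The `platforms` list and variable names are illustrative, and since AMD lists MI25 performance as TBD, the per-card value is my inference from the platform totals, not a published spec.

```python
# Implied FP16 TFLOPS per MI25 card, derived from the platform totals above.
# (platform name, number of MI25 cards, quoted peak FP16 TFLOPS per AMD)
platforms = [
    ("Inventec K888 G3", 4, 100),
    ("Falconwitch", 16, 400),
    ("Inventec Rack", 120, 3000),
]

implied_per_card = {
    name: total_tflops / card_count
    for name, card_count, total_tflops in platforms
}

for name, tflops in implied_per_card.items():
    print(f"{name}: {tflops:.0f} TFLOPS per card (implied)")
```

All three platforms work out to 25 TFLOPS per card, which suggests the totals are simple peak-rate multiples rather than measured system throughput.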
I was glad to see AMD’s ODM platform partners at their event in support of Instinct, as it gave me more comfort that this was more baked than a paper announcement. AMD also teased their Zen-based “Naples” server platform by saying it is “optimized for GPU and accelerator throughput computing”. This wasn’t the Naples show, it was for Instinct, but I’m sure we will be hearing a lot more about Instinct platform optimizations with Naples. AMD also reinforced at the event that, as a founding member of CCIX, Gen-Z and OpenCAPI, they’re working towards future 25 Gbit/s PHY-enabled accelerator and rack-level interconnects for Radeon Instinct.
Radeon Instinct software
Over the past decade, hardware has rarely been a challenge for Radeon. Over the past few years, AMD has immensely stepped up their software capabilities and pivoted to an open-source model. At the Instinct launch, we got that and a lot more, as AMD announced MIOpen and ROCm deep learning frameworks.
MIOpen is a free and open-source, GPU-accelerated library for Radeon Instinct to enable machine intelligence implementations and is planned to be available in Q1 2017. It supports convolution, pooling, activation functions, normalization and tensor formats.
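To make those primitives concrete, here is a minimal plain-Python sketch of three of them (convolution, an activation function and pooling) applied to a tiny made-up input. This is emphatically not the MIOpen API, which was not yet public at the time of writing. The function names, the 5x5 `image` and the 2x2 `kernel` are all hypothetical and only illustrate the kind of math such a library accelerates on the GPU.

```python
# Illustrative only: plain-Python versions of three primitive types that
# MIOpen is said to accelerate. This is NOT the MIOpen API.


def conv2d_valid(image, kernel):
    """2D 'convolution' as used in DNNs (cross-correlation), no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """ReLU activation: clamp negative values to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; any odd trailing row or column is dropped."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


# Hypothetical 5x5 input and 2x2 kernel, chosen only for illustration.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 3, 2, 1],
    [2, 0, 1, 0, 2],
    [1, 3, 2, 1, 0],
    [0, 1, 0, 2, 1],
]
kernel = [
    [1, 0],
    [0, -1],
]

features = conv2d_valid(image, kernel)  # 4x4 feature map
activated = relu(features)              # negatives clamped to zero
pooled = max_pool_2x2(activated)        # 2x2 summary
print(pooled)                           # [[0, 3], [3, 0]]
```

A real training run chains millions of these operations over much larger tensors, which is why a tuned GPU library matters more than the raw hardware numbers suggest.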
AMD also introduced ROCm deep learning frameworks optimized for Instinct. They plan on supporting popular frameworks like Caffe, Torch, and TensorFlow. These sit on top of MIOpen and the base-level, hardware-accelerated ROCm platform, which can remove some of the complexity of hand-tuning for the hardware. We wrote about the ROCm platform here.
I need to do some deep dives on exactly how this compares and contrasts with CUDA, but I believe that, at a minimum, AMD is taking responsibility for the entire stack. And that’s a good thing, as that’s what most customers want.
With Radeon Instinct, Advanced Micro Devices has formally entered the hot deep neural network market with the Fiji, Polaris and Vega architectures, attacking both training and inference with platforms enabling performance from 5.7 TFLOPS to 3 PetaFLOPS at 16-bit precision (per AMD). The company also announced a full stack of free and open software, a far cry from the days of a bag of parts. AMD appears to have their act together here more than at any other time.
In the end, it’s the partners who will need to sing Instinct’s praises, with both testimonials and their pocketbooks. Both Google and Alibaba have publicly gone on record here and here with AMD Radeon design wins in virtualized graphics, and while these are not DNN wins, they are for the data center, which is a positive step forward. I’m looking forward to the first AI cloud giant that steps up and talks about their AMD Radeon Instinct implementation, as this typically precedes a roll-out.