Datacenters, especially the really big guys known as the Super 7 (Alibaba, Amazon, Baidu, Facebook, Google, Microsoft and Tencent), are experiencing significant growth in key workloads that require more performance than can be squeezed out of even the fastest CPUs. Applications such as Deep Neural Networks (DNNs) for Artificial Intelligence (AI), complex data analytics, 4K live streaming video and advanced networking and security features are increasingly being offloaded to super-fast accelerators that can provide 10X or more the performance of a CPU. NVIDIA GPUs in particular have benefited enormously from the training portion of machine learning; the company reported 193% year-over-year growth last quarter in its datacenter segment, which is now approaching a $1B run-rate business.
But GPUs aren’t the only acceleration game in town. Microsoft has recently announced that Field Programmable Gate Array (FPGA) accelerators have become pervasive in its datacenters. Soon after, Xilinx announced that Baidu is using its devices to accelerate machine learning applied to speech processing and autonomous vehicles. And Xilinx announced last month a ‘reconfigurable acceleration stack’ that reduces the time to market for FPGA solutions with libraries, tools, frameworks and OpenStack support for several datacenter workloads. Now Amazon has announced the addition of Xilinx FPGAs to its cloud services, signaling that the company may be seeing market demand for access to this once-obscure style of chip for parallel processing. This announcement may be a significant milestone for FPGAs in general, and Xilinx in particular.
What did Amazon Announce?
Amazon is not the first company to offer FPGA cloud services, but it is one of the largest. Microsoft uses FPGAs internally but does not yet offer them as a service to its Azure customers. Amazon, on the other hand, built custom servers that enable it to offer new public Elastic Compute Cloud (EC2) F1 instances supporting up to eight 16nm Xilinx UltraScale+ FPGAs per instance. Initially offered as a developer’s platform, these instances target the experienced FPGA community. Amazon did not discuss the availability of high-level tools such as OpenCL or the Xilinx reconfigurable acceleration stack; adding these capabilities could open up a larger market for early adopters and developers. However, I would expect Amazon to expand the offering in the future, otherwise I doubt it would have gone to all the expense and effort of designing and building its own customized, scalable servers.
Why this announcement may be significant
First and foremost, this deal with the world’s largest cloud provider is a major design win for Xilinx over its archrival Altera (acquired last year by Intel), which was named as Microsoft’s supplier for its FPGA-enhanced servers. At the time of the Altera acquisition, Intel predicted that over one third of cloud compute nodes would deploy FPGA accelerators by 2020. Now it looks like Xilinx is poised to benefit from that expected market growth, in part because Xilinx appears to enjoy at least a one-year lead in manufacturing technology over Altera with its new 16nm FinFET-generation silicon, now shipping in volume production. Xilinx has also focused on providing highly scalable solutions, with support for PCIe and other capabilities such as the CCIX interconnect. Altera, on the other hand, has been focusing on integration into Intel, including the development of an integrated multichip module pairing a low-end FPGA with a Xeon processor. Surely, Intel wants to drag as much Xeon revenue along with each FPGA as possible. While this approach has distinct advantages for some lower-end applications (primarily through faster communications and lower costs), it is not ideal for applications requiring accelerator pooling, where multiple accelerators are attached to a single CPU.
Second, as I mentioned above, Amazon didn’t just throw a bunch of FPGA cards into PCIe servers and call it a day; it designed a custom server with a fabric of pooled accelerators that interconnects up to eight FPGAs. This allows the chips to share memory and improves bandwidth and latency for inter-chip communication. That tells us that Amazon may be seeing customer demand for significant scaling in applications such as inference engines for Deep Learning and other workloads.
Finally, Amazon must be seeing demand from developers across a broader market than the usual suspects on the list of the Super 7. After all, those massive companies possess the bench strength and wherewithal to buy and build their own FPGA-equipped servers and would be unlikely to come to a competitor for services. Amazon named an impressive list of companies endorsing the new F1 instance, spanning a surprising breadth of applications and workloads.
Where do we go from here?
The growing market for datacenter accelerators will be large enough to lift a lot of boats, not just GPUs, and Xilinx appears to be well positioned to benefit from this trend. It will now be important to see more specific customer examples and quantified benefits in order to gauge whether the FPGA is going mainstream or remains a relatively small niche. We also hope to see more support from Amazon for the toolsets needed to make these fast chips easier to use by a larger market. This includes support for application developers to use their framework of choice (e.g., Caffe, FFmpeg) with a simple compile option to target the FPGA, a goal of the recently introduced Xilinx acceleration stack.