Microsoft has announced more details about their use of Field Programmable Gate Arrays (FPGAs) to accelerate servers in their massive datacenters. CEO Satya Nadella made the announcements at their Ignite Conference in Atlanta (which MI&S colleague Patrick Moorhead attended), sharing details about their five-year journey called “Project Catapult.” The surprise was not that they are using FPGAs; Microsoft had disclosed their adoption of FPGAs to accelerate Bing search ranking over three years ago. What surprised many industry observers was the extent to which they are already deploying this typically esoteric style of chip and their plans for pervasive use of the technology in the future. Mr. Nadella said that the entire fleet of servers for the Azure cloud now has at least one FPGA installed in each server, delivering over one “exa op” (one billion billion operations per second) of total throughput across datacenters in 15 countries. An “ExaScale” computer in the traditional HPC sense (meaning double precision math) is not expected to appear until early in the next decade.
Microsoft CEO Satya Nadella at the Ignite Conference (Source: Microsoft)
If Microsoft’s penchant for FPGA acceleration spreads to others in the exclusive club of the Super Seven largest datacenter operators (Alibaba, Amazon.com, Baidu, Facebook, Google, Microsoft and Tencent), the impact on the chip industry, notably Intel (Altera), NVIDIA and Xilinx, could be substantial. Certainly, Intel’s $16.7B acquisition of FPGA leader Altera now appears to have been prescient; at the time of the acquisition, Intel predicted that 30% of servers would require FPGA acceleration by 2020. However, aficionados of this technology have cried wolf many times before in predicting that this difficult-to-program technology is about to cross the chasm and become a mainstream force in the industry.
What’s an FPGA and why does Microsoft care?
The real key to Microsoft’s heart is not just performance or power consumption. Microsoft points to the flexibility that FPGAs afford due to their inherent programmability. FPGAs, like GPUs, can be used to accelerate specific codes that lend themselves well to being executed in parallel, a critical and common feature of applications such as Deep (Machine) Learning. But the “P” in FPGA means programmable, and therein may lie their most important value to Microsoft and in the datacenter in general. Once programmed, the FPGA hardware itself can be changed (reprogrammed) in the field (hence the “F”) to enable it to evolve with changes in the company’s business, science and underlying logic. Microsoft says that they update the programming frequently, as often as every month.
As a result, Microsoft now sees FPGAs as an essential extension to nearly every server, accelerating a wide variety of demanding workloads in a world dominated by voice and image data. While many datacenters are now beginning to use GPUs to accelerate Machine Learning and other applications, helping to drive 110% growth in NVIDIA’s datacenter segment last quarter, the use of FPGAs has been predominantly confined to developing other chips and accelerating niche workloads such as networking, deep packet inspection, video transcoding, image processing and data compression. But these once-rare workloads are rapidly becoming mainstream as we all begin using voice and images instead of keyboards and mice. And the use of neural networks with these data types is exploding.
Note that using FPGAs is not without its challenges. Namely, the difficult task of programming these chips is often likened to rocket science, mastered by very few people who possess both hardware and software skills. But Microsoft says this investment can be justified by the impressive performance gains and the ability to adapt to changing business needs.
However, when a specific use case requires a very large number (on the order of a million) of these specialized chips to be deployed, developers typically burn the FPGA logic into an Application-Specific Integrated Circuit, or ASIC, turning the programmable chip into an even faster and lower-cost piece of fixed-function hardware. Note that this process can cost many tens of millions of dollars and take months or even years to perfect and produce. Google’s use of the Tensor Processing Unit (TPU) for the same Deep Learning inference job is a prime example. As a result, FPGAs have tended to remain a niche technology used by the brave and for workloads that need only small chip unit volumes. At least that has been the case until now.
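The volume argument above can be made concrete with a back-of-the-envelope break-even calculation. The sketch below is illustrative only: the function name, the per-unit prices and the one-time ASIC development cost are hypothetical placeholders chosen to match the rough orders of magnitude mentioned here (“many tens of millions of dollars,” “on the order of a million” units). They are not figures from Microsoft, Google or any chip vendor.

```python
def asic_break_even_units(asic_dev_cost, fpga_unit_cost, asic_unit_cost):
    """Smallest unit volume at which an ASIC is strictly cheaper in total.

    All costs are whole dollars (integers).
      total FPGA cost = units * fpga_unit_cost
      total ASIC cost = asic_dev_cost + units * asic_unit_cost
    Returns None when the ASIC saves nothing per unit and so never pays off.
    """
    saving_per_unit = fpga_unit_cost - asic_unit_cost
    if saving_per_unit <= 0:
        return None
    # Need units * saving_per_unit > asic_dev_cost, so take the next integer up.
    return asic_dev_cost // saving_per_unit + 1


# Hypothetical inputs: $50M one-time ASIC development cost,
# $100 per FPGA, $50 per ASIC once it is in production.
units = asic_break_even_units(50_000_000, 100, 50)
print(f"ASIC becomes cheaper at {units:,} units")
```

Under these assumed numbers, the one-time development cost is only recovered at roughly a million units, which is why lower-volume workloads tend to stay on FPGAs. The calculation ignores the other half of the trade-off described above: an ASIC cannot be reprogrammed in the field once it ships.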
So, what’s next?
Just because Microsoft can afford to hire an army of expensive computer scientists to program FPGAs doesn’t mean that FPGAs will take over the world. As I mentioned, when a chip is needed in sufficient volume, the cost of developing an ASIC can be justified, producing a much lower-cost platform which can then be programmed in a higher-level language. Intel’s recent acquisition of Nervana is another case in point. And of course, GPUs made by Advanced Micro Devices (AMD) and NVIDIA are excellent examples of this. Since these ASICs are “hardened” into silicon, they can be more affordable and even more efficient than an FPGA.
By being open about their innovative use of FPGAs, Microsoft undoubtedly hopes to broaden the appeal of FPGAs and increase the pool of talented engineers as well as the optimized libraries and software for the adoption of FPGAs in the datacenter. And Microsoft’s plans for FPGAs extend far and wide: beyond Deep Learning acceleration, Microsoft is using FPGAs to accelerate networking and the complex software required to implement software-defined networks.
Meanwhile, Intel Altera and Xilinx, the largest FPGA suppliers, will be cheering them on. For its part, Intel has married the cost and performance benefits of fixed hardware (Broadwell CPUs) with the flexibility of Altera FPGAs in a combined hybrid package. Xilinx, on the other hand, offers the flexibility of working with all CPU architectures (ARM, POWER and possibly AMD in the future) and interconnect technologies and has been active in partnering with IBM OpenPOWER and the OpenCAPI effort. This story will undoubtedly continue to evolve as more use cases and software become widely deployed in this new world of heterogeneous computing platforms.