F1 acceleration platform, which supports an 8-node Xilinx-equipped EC2 instance to enable development of FPGA-accelerated applications.

Figure 1: The Microsoft Brainwave mezzanine card extends each server with an Intel Altera Stratix 10 FPGA accelerator, synthesized to act as a “Soft DNN Processing Unit,” or DPU, and a fabric interconnect that enables datacenter-scale persistent neural networks. Source: Microsoft.

What did Microsoft announce?

While Amazon and Baidu are working to make FPGAs more accessible and easier to program on their clouds, Microsoft is perhaps the largest end-user of FPGAs for datacenter applications, accelerating a wide swath of its massive computing infrastructure and applications on Bing and Azure. To demonstrate the resulting prowess, Microsoft unveiled Project Brainwave, a scalable acceleration platform for deep learning that can provide real-time responses for cloud-based AI services. Microsoft had previously announced some 29 of these AI APIs, lowering the barriers to adoption for enterprises looking to get on board the AI bandwagon. Now Microsoft is sharing details about the hardware infrastructure upon which these MLaaS APIs and Bing internal services are built.

Microsoft’s Project Brainwave consists of three components:
- A high-performance systems architecture that pools accelerators for datacenter-wide services and scale. By linking its accelerators across a high-bandwidth, low-latency fabric, Microsoft can dynamically allocate these resources to optimize their utilization while keeping latencies very low.
- A “soft” DNN processor (DPU) that is programmed, or synthesized, on 14nm-class Altera FPGAs. More on this below.
- A compiler and run-time environment to support efficient deployment of trained neural network models using CNTK, Microsoft’s DNN platform. Similar to the case of Google’s TPU and TensorFlow, Microsoft requires a hardware platform that is optimized for their own framework. Interestingly, Microsoft has claimed that CNTK can have significant performance advantages over TensorFlow, especially for recurrent neural networks used for natural language processing. It is not clear to what extent Brainwave further enhances CNTK performance.
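
To make the first component more concrete, the idea of pooling accelerators and allocating them dynamically can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Accelerator` and `AcceleratorPool` classes, the device names, and the least-loaded selection policy are all hypothetical and are not drawn from Microsoft's implementation, whose scheduling details are not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Accelerator:
    """One pooled FPGA accelerator (hypothetical model, not Microsoft's API)."""
    name: str
    active_requests: int = 0


@dataclass
class AcceleratorPool:
    """Hypothetical shared pool that hands out the least-loaded device."""
    devices: list = field(default_factory=list)

    def acquire(self) -> Accelerator:
        # Pick the device with the fewest in-flight requests.
        # On a tie, min() returns the first such device in the list.
        device = min(self.devices, key=lambda d: d.active_requests)
        device.active_requests += 1
        return device

    def release(self, device: Accelerator) -> None:
        # Return capacity to the pool when a request completes.
        device.active_requests -= 1


pool = AcceleratorPool([Accelerator("fpga-0"), Accelerator("fpga-1")])
first = pool.acquire()    # fpga-0: both devices are idle, first wins the tie
second = pool.acquire()   # fpga-1: fpga-0 already has one request in flight
pool.release(first)
third = pool.acquire()    # fpga-0 again, since it is idle once more
```

Because requests draw from one shared pool rather than a device bound to a particular server, idle capacity anywhere in the pool can absorb new work, which is the utilization benefit the first bullet describes. A real system would also have to weigh network distance and model placement, which this sketch ignores.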