Intel hosted its inaugural “AI Day” in San Francisco Thursday to highlight the company’s strategy, products and ecosystem for the fast-growing market for chips that power Artificial Intelligence (AI). I came to the event eager to learn whether the company would be bold enough to add a new architecture to its portfolio, namely the technology it acquired through Nervana, or whether it would stick to the CPU-centric strategy that has been the company’s foundation since its early days. But I was pleased to learn that the company has decided to add Nervana to its portfolio as a scalable accelerator, and is investing broadly to build an ecosystem for its AI portfolio.
This will be the first time a major semiconductor company has developed an architecture dedicated to a specific workload, which is an indication that the company sees a very large market for AI acceleration, and I believe this may be a harbinger for future industry trends. The company also highlighted software and ecosystem initiatives for AI, including an impressive AI education program to enable the enterprise market, and software from Saffron, which Intel acquired in 2015. But for this article, I will focus on the hardware story.
What has Intel announced?
The event provided the platform for Nervana’s coming-out party, establishing its broad strategic role in the Intel portfolio and roadmap. In addition to applying Nervana IP to Intel’s portfolio of x86 chips, Intel committed to productizing the Nervana Engine and the Neon DNN (deep neural network) software for Intel’s chips in 2017. In an uncharacteristically aggressive move, Intel also set a goal for the future, making the bold claim that Nervana would deliver a 100-fold improvement in DNN performance over today’s “best GPU” solutions by 2020.
The company disclosed a few details about the Nervana Engine, now code-named “Lake Crest”, that point to the underlying advantages it hopes to exploit. First, each chip has an on-die fabric that enables strong scaling, with multiple nodes per CPU interconnected at up to 20 times the bandwidth of PCIe Gen 3. This is critical because it supports “model parallelism” beyond what can be achieved with CPUs alone. Second, the Nervana design team came up with a novel approach to reduced-precision math. Instead of implementing “half-floats” (16-bit floating-point execution) to speed up the calculations in training a neural network, they invented a format they call “flex-point”, which can deliver precision close to that of floating point with efficiency similar to that of an integer execution unit. A common exponent applies to an entire array of (integer) values, effectively performing lower-precision floating-point math at the rate normally achieved for integer operations. Finally, each node has its own memory interface to HBM2, supporting 32GB of fast memory. That’s a lot of expensive memory, so this will not be a cheap part.
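Intel has not publicly specified the flex-point format, but the shared-exponent idea it describes is a form of block floating point, which can be sketched in a few lines of Python. The function names and the 16-bit mantissa width below are my own illustrative choices, not details of the actual hardware:

```python
import math

def flex_quantize(values, mantissa_bits=16):
    """Quantize a list of floats to integers sharing one exponent.

    This is a block floating-point sketch in the spirit of Nervana's
    "flex-point": one exponent covers the whole array, so the bulk of
    the arithmetic can run on integer execution units.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Pick the exponent so the largest value fills the mantissa range.
    exponent = math.ceil(math.log2(max_abs)) - (mantissa_bits - 1)
    scale = 2.0 ** exponent
    mantissas = [round(v / scale) for v in values]
    return mantissas, exponent

def flex_dequantize(mantissas, exponent):
    """Recover approximate floats from shared-exponent integers."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]

# Values spanning a wide range share one exponent; the largest values
# are represented exactly, the smallest lose some precision.
mantissas, exponent = flex_quantize([0.5, -1.25, 3.0, 0.001])
approx = flex_dequantize(mantissas, exponent)
```

The trade-off is visible in the example: values close to the array maximum round-trip exactly, while very small values in the same block lose low-order bits, which is why the format targets neural-network training, where such noise is tolerable.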
The Nervana Engine chip, now called “Lake Crest”, will be a high-end part with accelerators, 3D fabric, and 32 GB of expensive HBM2 memory. (Source: Karl Freund)
In addition to the Nervana Engine, the Nervana Platform will become the umbrella brand for AI acceleration across the Intel portfolio, from Xeon to Phi to Atom to Quark. For example, the company indicated that it would build a multi-chip integrated module combining a Xeon CPU with a Nervana Engine accelerator, in the same way that it has integrated Altera FPGAs. This is a relatively easy engineering endeavor and makes business sense by hard-bundling the CPU. However, the 1-to-1 pairing limits the scalability of the solution, so I would expect Intel to continue to offer standalone accelerators from which its partners can build boards with multiple interconnected modules.
But AI does not equal the Nervana Engine. Some machine learning problems require a great deal of memory, for example, and here is where Xeon Phi will be positioned as the preferred solution. The company reiterated its plans to add reduced-precision math to Xeon Phi next year, in the Knights Mill product, for these applications. And of course Xeon, Core and Quark processors and FPGAs will be optimized with Nervana IP and software and used for inference processing, where the trained neural network is actually put to use.
Where does all this leave Intel in AI?
Until these announcements, Intel’s AI strategy had been a mixed bag of parts with frankly confusing, even conflicting, product positioning. By bringing the entire story together, and telling it through the company’s CEO and a stable of SVPs and partners, Intel has now communicated a clear strategy, a hardware roadmap, a software portfolio and a go-to-market plan that will enable it to compete in this market.
However, Intel needs to execute flawlessly; after all, this event was just (good) slide-ware. We did not see any benchmarks of real silicon for Lake Crest or Knights Mill, and the GPU comparison claims were quite vague. The go-to-market partnerships and education initiatives are all still in the planning stage. So there is much work to be done.
Also, keep in mind that if a little startup like Nervana can build an AI accelerator, others such as Wave Computing and Graphcore can too, not to mention NVIDIA, the market leader. It will probably take many years for Intel to come close to matching NVIDIA’s pervasive ecosystem, and of course NVIDIA isn’t standing still or resting on its laurels.
But what this does mean is that Intel appears to now have its AI act together, has some very impressive technology and leadership, and is dead-set on not missing out again on the next big thing, which AI most certainly will become.