Intel held a full-day event last week to lay out its strategy and products for growth in the datacenter. Senior executives took the opportunity to talk about CPUs, ASICs, FPGAs, memory, and networking, while sprinkling in a healthy dose of real-world customer success stories. Investors were hoping that this event would shed some light on how the company plans to compete with NVIDIA for AI and with AMD for datacenter CPUs. In addition to a few product announcements, we learned more about the company’s AI strategy (which I initially outlined here back in May after its inaugural AI DevCon event). My colleague Patrick Moorhead, President of Moor Insights & Strategy, covered the broader datacenter topics here in this blog, but here’s my take on the AI side of things.
What's new for Intel AI?
Naveen Rao, Intel’s SVP for AI, articulated the company’s strategy for AI: to provide the full range of general-purpose and specialized devices for AI, supported by a unified suite of optimizing development software. As Mr. Rao correctly pointed out, running AI apps is not a market where one size fits all, and Intel offers a broad range of performance, latency, and power envelopes tailored to a wide variety of AI processing. This portfolio includes Xeon for the datacenter, Movidius for embedded vision, Mobileye for automotive, and Altera FPGAs for edge and datacenter inference. The missing piece of the puzzle remains the Nervana ASIC for training, which should arrive next year.
Intel started out by sharing its internal analysis of its current AI business, saying it sold an estimated $1 billion of silicon into datacenter AI in 2017. It is not clear exactly what that figure counts (inference on AWS, CPUs attached to NVIDIA GPUs, etc.), but it sounds reasonable to me. Intel also updated its projected market size (TAM) for AI silicon to $10 billion in 2022 (up from its previous estimate of $8 billion). The company also raised its TAM estimate for all datacenter silicon (servers, accelerators, memory, networking, and storage) to $200 billion by 2022 (up from $160 billion). That’s a big jump, and it reflects Intel’s bullishness on these new segments’ growth rates.

Intel announced at the event that it will release the Cascade Lake Xeon update for AI this fall. The update adds a feature called “DL Boost” to the AVX-512 vector units, enabling support for int8 (8-bit integer) math operations, which the company says will speed up inference processing by a factor of 11. In a follow-on update due in 2019, called Cooper Lake, DL Boost will add bfloat16 AVX-512 instructions to improve training performance. Bfloat16, which comes from Google TensorFlow, is simply a float32 whose mantissa is truncated to 7 bits, keeping all 8 bits of the exponent. That gives these numbers the same dynamic range as 32-bit floating point, unlike IEEE’s float16 (a short sketch of the format appears at the end of this section). Clearly, Intel is trying to keep Google happy by supporting this fast-emerging requirement. The addition of bfloat16 will help those who use CPUs for training neural networks whose memory demands exceed what is available on GPUs and TPUs (albeit today that’s a pretty small market).

Intel’s Nervana Neural Network Processor, the NNP L-1000, will go after the larger training market when it launches in 2019. This may be Intel’s first production chip to challenge NVIDIA GPUs, which are currently the gold standard for datacenter training. The stakes are high, and Intel needs to get this chip right after shelving the highly anticipated first-generation Nervana product.
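To make the bfloat16 format concrete, here is a minimal Python sketch. It is my own illustration, not Intel’s or Google’s code, and it converts a float32 to bfloat16 by simply truncating the low 16 mantissa bits (production conversions typically round to nearest even instead). The round trip shows why the dynamic range matches float32 rather than IEEE float16:

```python
import struct

def float32_to_bfloat16_bits(x):
    # Reinterpret the float32 as 32 bits, then keep only the top 16:
    # 1 sign bit + 8 exponent bits + 7 mantissa bits = bfloat16.
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits32 >> 16

def bfloat16_bits_to_float32(bits16):
    # Zero-pad the mantissa back out to reconstruct a float32 value.
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]

# Truncation costs mantissa precision but leaves the exponent untouched,
# so the range is the same as float32. IEEE float16 spans only about
# 6e-8 to 65504, so all but the first value below fall outside it.
for x in (3.14159265, 1.0e-30, 131008.0, 1.0e38):
    y = bfloat16_bits_to_float32(float32_to_bfloat16_bits(x))
    print(f"{x:.6e} -> {y:.6e}")
```

Part of the format’s appeal is that converting to and from float32 is just a 16-bit shift, which keeps mixed-precision arithmetic simple in both hardware and software.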