Article by Karl Freund.
Qualcomm held an AI Day event this week in San Francisco to tout the company’s broad and deep Artificial Intelligence capabilities, across mobile, automobiles, IoT, cloud computing, and more. I will focus here on the company’s first foray into the data center for AI, while my colleague, Anshel Sag, will cover mobile and gaming in another Forbes article.
Qualcomm has long sought an avenue into the lucrative market for data center chips, most recently with its high-performance 64-core ARM CPU, Centriq. The CPU market is a tough nut to crack, though, given Intel's dominant market share, AMD's x86-compatible EPYC server devices, and the software challenges an ARM chip must address. Qualcomm essentially abandoned Centriq last year for what it hopes is a better idea: a data center accelerator for Artificial Intelligence processing. In my view, AMD EPYC diminished the opportunity for non-x86 server CPU contenders, so Qualcomm's move makes a world of sense. Qualcomm believes that extending its 5G chips with AI in the cloud will give the company an end-to-end AI platform to support its vision of trillions of connected things with intelligent services at the wireless edge. The company projects that by 2025 the penetration of AI in edge devices will grow from today's 10% attach rate to 100%.
Qualcomm already possesses impressive AI technology in its flagship Snapdragon 855 mobile chip, and is now extending its footprint to cloud and edge servers. Since the market for training the neural networks used in AI is dominated by NVIDIA, Qualcomm wisely decided to focus on the emerging inference market. In inference, trained neural network models are processed to make human-like decisions, such as recognizing a person's face, suggesting purchases, or translating natural languages. While the bulk of AI accelerators deployed in data centers over the last few years were for training, inference is widely expected to become the larger market over time, rising to perhaps $20B by 2025. Additionally, unlike in training, there is no 800-pound gorilla Qualcomm needs to unseat to become a major player in inference, although NVIDIA and Intel have placed their bets here, alongside dozens of aspiring startups.
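To make the training/inference distinction concrete, here is a minimal sketch of what inference is: a forward pass through a fixed, already-trained set of weights, with no gradients or weight updates. The tiny network and its weights below are illustrative stand-ins, not any real trained model or Qualcomm software.

```python
import numpy as np

def softmax(z):
    # Convert raw scores (logits) into a probability distribution.
    e = np.exp(z - z.max())
    return e / e.sum()

# Pretend these weights came from a prior training run
# (e.g. a toy two-class classifier); they are random placeholders here.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))   # 4 input features -> 2 output classes
b = np.zeros(2)

def infer(x):
    """Inference = forward pass only: no gradients, no weight updates."""
    logits = x @ W + b
    probs = softmax(logits)
    return int(np.argmax(probs)), probs

x = np.array([0.5, -1.2, 0.3, 0.9])   # one incoming "query"
label, probs = infer(x)
print(label, probs)
```

An inference accelerator's job is to run millions of such forward passes per second, at far higher efficiency than a general-purpose CPU.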
The Qualcomm Cloud AI 100
Qualcomm has a long history of producing high-performance chips, primarily for mobile phones and embedded applications. As a result, it has a lot of experience in advanced manufacturing process nodes, signal processing, and power efficiency. To translate that heritage into a successful foray into the data center, the company had to design a much larger chip—the new Cloud AI 100. The new chip is being fabricated on 7nm technology, ahead of potential competitors, and is expected to sample in the second half of 2019.
Inference acceleration: a growing market
The emerging inference market for large-scale or complex neural networks is wide open, and Qualcomm is one of the few large players competing for this growing opportunity. Most inference work in the data center today is processed quite adequately on Intel Xeon CPUs, but that will change as more complex neural networks and multi-network workflows become the norm. A chip for AI inference processing in the cloud has to be dramatically better than a CPU if it is going to compete. It has to be able to handle hundreds or thousands of simultaneous queries with low-latency response times.
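The "simultaneous queries with low latency" requirement can be quantified with Little's law, which relates concurrency, throughput, and latency for any serving system. The figures below are illustrative assumptions, not Qualcomm numbers:

```python
# Little's law for a serving system:
#   queries in flight = (queries completed per second) * (seconds per query)
# Illustrative numbers only -- not measured or claimed Qualcomm figures.
throughput_qps = 10_000    # queries the accelerator completes per second
latency_s = 0.010          # 10 ms per-query latency target
concurrent_queries = throughput_qps * latency_s
print(concurrent_queries)
```

The takeaway: sustaining thousands of in-flight queries at a 10 ms latency target requires tens of thousands of completions per second, which is why a cloud inference chip must beat a CPU by a wide margin rather than incrementally.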
Qualcomm promises performance and power efficiency
This raises the question: how good is the new AI 100? At the launch event in San Francisco, Qualcomm said it delivers over 50 times the performance of the Snapdragon 855, supports all the major AI software frameworks, and is at least 10 times faster than any chip it has tested. Snapdragon delivers 7 Trillion Operations Per Second, or TOPS, in processing inferences, so 50X implies 350 TOPS for the AI 100. That is significantly faster than any chip I am aware of that is currently on the market or announced. While I eagerly await real-world inference benchmarks, such as MLPerf or ResNet-50 images per second, this chip will certainly garner a lot of attention in the soon-to-be crowded marketplace.
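The 350 TOPS figure is a simple back-of-the-envelope extrapolation from Qualcomm's stated multiplier, and is worth treating as a floor implied by the claim rather than a measured result:

```python
# Implied performance floor from Qualcomm's ">50x Snapdragon 855" claim.
snapdragon_855_tops = 7      # Snapdragon 855 AI performance, per Qualcomm
claimed_multiplier = 50      # "over 50 times" -- so this is a lower bound
cloud_ai_100_tops = snapdragon_855_tops * claimed_multiplier
print(cloud_ai_100_tops)     # 350 TOPS, the figure cited in the text
```

Note that peak TOPS is a marketing metric; delivered performance depends on precision (INT8 vs. FP16), memory bandwidth, and how well real models utilize the compute, which is exactly what benchmarks like MLPerf are designed to expose.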
Early adopters may include Facebook and Microsoft
Even more impressive than the AI 100 device’s performance is the company it attracts: both Microsoft and Facebook surprised the launch audience and joined Qualcomm on stage to share their AI use cases. This is noteworthy, since both these companies rarely, if ever, announce support for new silicon products, and both have publicly shared in-house chip development efforts for inference. Although they did not state their intentions to use the Qualcomm chip, they have to be very interested if they agreed to present at the Qualcomm AI Day event.
Facebook’s Joe Spisak took the stage to talk about the company’s pervasive use of AI. The social media leader crunches an amazing 200 trillion inference queries every day, and its data center power consumption is doubling every year. The message here is that it needs high performance, low latency, and power efficient accelerators for its growing inference workloads.
Microsoft's Venky Veeraraghavan discussed the infusion of AI into Azure, the Windows ML Always Connected PCs on the Qualcomm SDM850, and the HoloLens 2 based on the Qualcomm Snapdragon 850. The net impression is that Microsoft has a close co-development relationship with Qualcomm, and has an unquenchable thirst for AI inference processing. The unspoken assumption is that Microsoft, which invests heavily in FPGAs, will be testing the Cloud AI 100 when it becomes available.
With the AI 100 in the cloud, and the Snapdragon 855 bringing AI to some of the world's fastest phones, Qualcomm is positioning itself for AI leadership from the edge to the cloud. While the company is clearly a leader in mobile processors and modems, until this announcement, I don't think many people in the data center world would have associated AI with Qualcomm. That could be about to change if Microsoft and Facebook's launch participation indicates the potential adoption of the AI 100 by the world's largest data centers. Qualcomm is strong in developing low-power devices and leading manufacturing process nodes, so I would be surprised if many competitors, if any, can beat it on power efficiency. The chip's performance appears to be quite impressive; in fact, the AI 100 may be the fastest AI inference processor yet promised. The challenge now is to transform that raw performance potential into real-world application performance and efficiency. So far, Qualcomm's entry into the data center looks impressive.
Karl Freund is a Moor Insights & Strategy Senior Analyst for deep learning & HPC