Last year, Qualcomm teased its Cloud AI100, promising strong performance and power efficiency to enable Artificial Intelligence in cloud edge computing, autonomous vehicles and 5G infrastructure. Today, the company announced it is now sampling the platform, with volume shipments planned for the first half of 2021. This raises the question: why would a company known for low-power cell-phone chips and IP decide to enter the data center market, which is full of players who have been there for decades?
Qualcomm has long had its eye on data center solutions, viewing the sector’s attractive growth and profit margins as a potential source of increased revenue. The company’s vaunted power efficiency and performance are well suited to scaling up to data center products, and the explosion in AI presents an ideal entry point. Furthermore, Qualcomm’s 5G modems and Snapdragon application processors could help the company build complete solutions, avoiding head-on price battles with smaller companies that only have an SoC for AI. Qualcomm estimates that the serviceable addressable market (SAM) for AI inference processing could exceed $70B by 2025, of which some $50B is in the target market spaces for the Cloud AI100.
What did Qualcomm announce?
In addition to announcing its production schedule, Qualcomm shared technical details and some specs for the AI100 chip and the delivery platforms it intends to sell. Qualcomm plans to offer a family of devices, including a 70 TOPS M.2 edge card that consumes only 15 watts TDP, a 200 TOPS M.2 card that runs at 25 watts and a data center-class PCIe platform that cranks out 400 TOPS at only 75 watts TDP. Across the family, that works out to as much as eight TOPS/Watt, a mark reached by the 200 TOPS M.2 card.
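The efficiency figures follow from dividing each card's quoted peak TOPS by its power rating. Here is a minimal Python sketch of that arithmetic, using only the numbers quoted above. The dictionary labels and the function name are illustrative, not Qualcomm product names, and peak TOPS per watt is a paper spec rather than a measured result.

```python
# Peak TOPS and power (watts) as quoted in the announcement.
# The keys are informal labels chosen for this sketch.
CARDS = {
    "M.2 edge": (70, 15),
    "M.2": (200, 25),
    "PCIe": (400, 75),
}


def tops_per_watt(tops, watts):
    """Peak throughput divided by power: a rough efficiency figure."""
    return tops / watts


efficiency = {name: tops_per_watt(tops, watts) for name, (tops, watts) in CARDS.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} TOPS/W")
```

By this measure the 200 TOPS M.2 card is the one that reaches eight TOPS/Watt, while the 75-watt PCIe platform comes in a little above five.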
Qualcomm also showed off a new 5G development kit, scheduled to become available next month. The new Cloud Edge AI Development Kit comes complete with Qualcomm’s X55 5G modem-RF system and top-of-the-line Snapdragon 865 processor. For more specs on this kit, see Figure 2.
Intel Xeon dominates the market for AI inference, but that is changing as AI models become far more complex. The industry group ai.org estimates that AI models are doubling in size every 3.5 months. This creates opportunities for companies like NVIDIA and dozens of startups to meet these computational requirements. Qualcomm did not include recent announcements from Blaize and Tenstorrent in its view of the landscape. That said, the Cloud AI100 looks to have a significant lead over the field, processing some 25,000 images per second on the ResNet-50 model at only 75 watts. I would point out that ResNet-50 is a pretty tiny benchmark, especially when compared to natural language processing models like Google's BERT. But, still, it is fairly impressive that the Cloud AI100 M.2 delivers roughly four times the performance per watt of the former king of the inference hill, Intel's Habana Goya, on the same benchmark.
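To put two of those numbers in perspective, the short Python sketch below converts the quoted doubling period into an annual growth factor and the quoted ResNet-50 throughput into images per second per watt. It assumes steady exponential doubling, which is a simplification, and the variable and function names are illustrative. It does not try to reproduce the competitive comparison, since the rival chips' figures are not given here.

```python
# Figures quoted in the article; names are illustrative.
DOUBLING_PERIOD_MONTHS = 3.5
IMAGES_PER_SECOND = 25_000
BOARD_POWER_WATTS = 75


def growth_factor(months, doubling_period=DOUBLING_PERIOD_MONTHS):
    """Size multiplier after `months`, assuming steady exponential doubling."""
    return 2 ** (months / doubling_period)


yearly_growth = growth_factor(12)
images_per_second_per_watt = IMAGES_PER_SECOND / BOARD_POWER_WATTS

print(f"Model size growth over 12 months: {yearly_growth:.1f}x")
print(f"ResNet-50 throughput per watt: {images_per_second_per_watt:.0f} images/s/W")
```

Under that assumption, a 3.5-month doubling period compounds to nearly an 11-fold increase in model size per year, which is why inference hardware sized for today's models can look small quickly.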
Speaking of larger and more complex models, Qualcomm clearly anticipates a need for much more memory. The chip’s 144 MB of on-die SRAM is supplemented with up to 32 GB of LPDDR4 on the PCIe card. Additionally, the chip supports a range of numeric precisions, including 8- and 16-bit integer and 16- and 32-bit floating-point math.
As for the software, typically the Achilles heel for AI startups, the AI100 leverages Qualcomm’s rich mobile inference ecosystem and already offers a full stack of frameworks and optimization tools.
The Cambrian Explosion of AI chips marches on, and with this announcement, a new heavyweight contender has entered the ring. If Qualcomm’s claims of 400 TOPS prove out, this will be the fastest chip I've seen that is dedicated to inference processing. That said, we will have to wait for a formal announcement from Groq, which indicated its offering could be in the same ballpark but at significantly higher power, and Tenstorrent, which presented some intriguing approaches to AI at the recent Hot Chips conference. Additionally, we are still waiting for details from companies like Intel and SambaNova. I am also very interested in seeing more performance data from the NVIDIA A100’s multi-instance GPU feature, which amortizes the cost and power of a big GPU over seven inference instances. I believe this approach holds tremendous potential in the data center.
Beyond the specs, I believe that many customers would choose to do business with an established semiconductor company, like Qualcomm, over a startup—unless the youngster can deliver dramatically better performance and efficiency. Qualcomm provides rock-solid quality, performance, efficiency, support and a complete software ecosystem for AI inference processing, born from years of experience with Snapdragon. It’s a powerful premise, and I look forward to seeing more benchmarks. Let the Explosion continue!