Article by Karl Freund.
Last week at GTC 2019, Jensen Huang, the high-energy and immensely entertaining CEO and founder of NVIDIA, took the stage to give his keynote to the event’s 6,000+ attendees. However, this was anything but his usual keynote. We have all been spoiled for years by NVIDIA’s dependable yearly flood of new products for graphics and AI. This year, though, Huang spent much of his time describing a dizzying array of relatively small announcements. While some may have been disappointed, this shouldn’t have surprised anyone who has been paying attention. NVIDIA’s GPUs for graphics and data science applications were all refreshed over the last couple of years, and they remain clear performance leaders in their segments.
While NVIDIA’s hardware engineers have been hard at work designing their next-generation chips, their software and partner engineering counterparts have been busy deepening the formidable moat that the NVIDIA ecosystem provides as a defense against newcomers. Jensen’s keynote clocked in at 2 hours and 40 minutes, and it would take hours to read and understand the myriad announcements. If interested, his full keynote can be found here. For the sake of this column, we’ll stick to the highlights.
NVIDIA GPUs becoming ubiquitous in the datacenter
Jensen shared a slew of announcements geared toward making GPUs more widely available and consumable across a variety of hardware and cloud platforms for running games, workstation applications, and AI. These included new servers with T4 GPUs from Cisco, Dell EMC, Fujitsu, Hewlett Packard Enterprise, Inspur, Lenovo, and Sugon. T4s are also now available as cloud instances from AWS, adding to the existing Google beta support for the T4 announced in January. This is significant because NVIDIA must successfully fight off the coming hordes of startups bringing inference chips to market later this year. The affordable and fast T4 is essentially a mass-market, multi-purpose data center GPU. It can be used for AI inference, gaming, remote workstations (VDI), ray tracing for rendering, and even AI training, according to Ian Buck, NVIDIA’s VP of Data Center products.
If you are an NVIDIA customer, this is all very good news. If you are intending to compete with NVIDIA, your job just got a lot more difficult. NVIDIA’s broad ecosystem will be difficult if not impossible to match; simply having a better chip for limited uses won’t cut it, except for a few very large markets such as vision processing (think smart surveillance cameras), autonomous vehicles, and industrial automation (think robots).
Inference isn't always easy
One of the most impressive demos Jensen shared was the Microsoft Bing conversational search engine, powered by NVIDIA GPUs. Many people think that inference is a lot easier than training, which is true; for this reason, they believe that low-cost inference chips will rule the roost, which is debatable. That simplicity holds only for relatively easy jobs, like object detection and recognition in images. Truly intelligent services like Bing combine many inference tasks to create a natural interface that understands what the user is really looking for. For example, the workflow could look like this:
- understanding the spoken query in a number of languages
- translating that query into text
- submitting that query to a search engine
- determining the optimal response, perhaps in the context of a multi-query conversation
- synthesizing the spoken result, and
- displaying the top-ranked results.
According to Microsoft and Jensen Huang, it takes a GPU to deliver both the computational throughput and the programmability needed to handle the different types of neural networks each step requires (DNNs, CNNs, GANs, RNNs, and reinforcement learning models).
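The multi-stage workflow above can be sketched as a simple pipeline. This is a purely illustrative sketch: every function name and stage boundary here is a hypothetical stand-in for a model invocation, not the actual Bing architecture or any real API.

```python
# Hypothetical sketch of a conversational-search inference pipeline like the
# one described above. Each function stands in for a neural-network inference
# step; in a real system, each would dispatch a model to a GPU.

def transcribe(audio: bytes) -> str:
    """Speech-to-text (e.g., an RNN/transformer acoustic model)."""
    return "what is the tallest mountain"  # stubbed result

def translate(text: str, target_lang: str = "en") -> str:
    """Neural machine translation, if the query is not already in English."""
    return text  # stub: assume the query is already English

def search(query: str) -> list[str]:
    """Retrieve candidate answers from the search index."""
    return ["Mount Everest is the tallest mountain above sea level."]

def rank(candidates: list[str], history: list[str]) -> str:
    """Pick the best answer, possibly using multi-query conversation context."""
    return candidates[0]

def synthesize(answer: str) -> bytes:
    """Text-to-speech for the spoken response."""
    return answer.encode()

def handle_query(audio: bytes, history: list[str]) -> tuple[bytes, str]:
    """Chain the stages: speech -> text -> search -> rank -> speech + display."""
    text = translate(transcribe(audio))
    answer = rank(search(text), history)
    return synthesize(answer), answer
```

The point of the sketch is the chaining: a single user query touches several different model types in sequence, which is why raw speed on one network architecture is not enough for a service like this.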
One software platform to rule them all
Another key announcement that might be a little confusing is the new “CUDA-X” AI ecosystem. Basically, this is a rebranding of the data analytics, graph processing, machine learning (RAPIDS), and deep learning training and inference libraries that span workstations, servers, and cloud platforms. Just like the NVIDIA GPU Cloud, this restructuring and combining of software stacks should ease deployment of compatible software components. From an industry perspective, CUDA-X widens and deepens the defensive moat NVIDIA enjoys today, potentially protecting NVIDIA from the dozens of AI chip startups and giants who are preparing their own devices and software for introduction later this year (see my three-part series on the Cambrian Explosion of AI chips).
Autonomous vehicles on parade
The GTC show floor provided live interaction with thousands of NVIDIA-powered devices from hundreds of vendors. The most impressive was undoubtedly a massive TuSimple semi-truck, based on a customized Peterbilt model. As a startup, TuSimple has attracted $178M in venture funding and provides full Level 4 autonomous long- and short-haul trucking as a service. It is deployed on 3 to 5 delivery trips daily in Arizona, and soon in Texas. Until the event, I had not realized that anyone had a Level 4 vehicle running autonomous, revenue-generating routes on public roads. The TuSimple truck has over a dozen sensors, including cameras which, along with radar and lidar, make up a perception system that “sees” a standard-setting 1,000 meters ahead, providing 35 seconds to respond to hazards and obstructions. That is over three times the range lidar alone can provide, and it puts TuSimple ahead of Google’s Waymo and Tesla.
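A quick back-of-the-envelope check shows the 1,000-meter range and 35-second reaction window are mutually consistent at highway truck speeds (the conversion below is mine, not from TuSimple):

```python
# Sanity check: what speed does a 1,000 m perception range with ~35 s of
# reaction time imply?
range_m = 1000
reaction_s = 35

speed_ms = range_m / reaction_s           # ~28.6 m/s
speed_mph = speed_ms * 3600 / 1609.344    # ~63.9 mph

print(round(speed_ms, 1), round(speed_mph, 1))  # roughly typical highway truck speed
```

In other words, the quoted figures correspond to a truck traveling at about 64 mph, which is exactly the regime where long-range perception matters most.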
While many were disappointed (though not surprised) that no new chips were announced, I, for one, still came away impressed. NVIDIA’s ecosystem appears unstoppable in AI, garnering support in just about every university, every cloud, every server vendor, and every Hollywood post-production studio on the planet. As its position continues to strengthen, and it gets past the crypto-fed inventory hangover, NVIDIA remains the undisputed leader in what is about to become a far more crowded and competitive market for AI silicon.
Karl Freund is a Moor Insights & Strategy Senior Analyst for deep learning & HPC.