NVIDIA Is Coming For Your Data Center

By Steve McDowell - April 12, 2019

NVIDIA’s GPU Technology Conference (GTC) took place a few weeks ago near the company’s headquarters in San Jose, California. Be careful how you say “GPU,” though: NVIDIA’s founder and CEO Jensen Huang was clear that it’s a term he avoids, preferring to use product names. NVIDIA’s processing technology, you see, is about much more than just graphics.

“NVIDIA is a data center company,” the usually hyperkinetic Jensen Huang said, very matter-of-factly to a room full of industry analysts. He went on to say that NVIDIA is “focusing on the big problems of data center scale computing,” and “delivering an accelerated computing platform.”

Jensen’s vision aligns with where the world is going. Enterprise workloads are increasingly being enabled by AI technologies, and corporate data is being mined for insights by data center machine learning stacks. Edge computing and the rise of 5G are going to accelerate the need to deliver real-time analytics and insights, enabled by exactly the kinds of technologies that NVIDIA delivers.

Artificial intelligence, whether it’s lightweight inference on the client or deep learning training in the data center, is the future of compute. It’s a future that runs on specialized hardware enabled by a consistent software stack.

Heading into the data center

NVIDIA built a machine learning supercomputer that it calls DGX. Jensen told the industry analysts at GTC that he really didn’t want to build the DGX, explaining that it was a big, expensive project that the tier-one OEMs would have been better at. The problem was that he couldn’t find an OEM that shared his vision, so he built it himself. Then the OEMs came calling.

Machine learning in the data center requires a different way of thinking about data and storage. Machine learning pipelines are hungry beasts that need to be fed. NVIDIA recognized this and teamed with innovative companies in the storage space to bring hyper-converged machine learning solutions to market, ready-made for the various OEMs’ channel partners.

Pure Storage last year introduced its AIRI platform, coupling its high-performance FlashBlade with the DGX. IBM recently released its Spectrum Storage for AI, marrying IBM storage with NVIDIA’s DGX series. Even Dell EMC partnered with the company, very recently announcing a coupling of NVIDIA’s DGX with Dell EMC Isilon. There is no shortage of solutions.

NVIDIA’s DGX is a stellar offering, but it’s a very specialized piece of gear. In order to truly penetrate the data center, it’s critical that the server OEM world is engaged and building products around the solution. To that end, at GTC, NVIDIA introduced a number of server designs, all of which are being built by the server OEM community. Dell EMC, Hewlett Packard Enterprise, Inspur, Fujitsu, Sugon, and Lenovo are all building workstations and servers based around NVIDIA specifications and validation suites.

This is a critical step forward, not just for NVIDIA and the server supplier world, but for enterprise IT. The strength of these relationships allows enterprise buyers to trust the solutions that they are deploying for AI and ML workloads.

It's all about that software

NVIDIA’s machine learning success comes from the intelligent choices the company makes in enabling the software ecosystem. I don’t know whether AI and machine learning were always a part of NVIDIA’s vision when it delivered CUDA to the world over a decade ago, or if it was a happy accident. It’s likely that the company’s GPUs were simply well-suited to solve machine learning problems, and CUDA gave computer scientists the right sets of tools to exploit the raw capabilities.

It’s hard to remember that far back, but NVIDIA’s only real competition when it delivered CUDA was ATI, which had just been acquired by AMD. AMD seriously considered joining NVIDIA in enabling a CUDA ecosystem (I know this because I was on the AMD corporate strategy team at the time), but instead made the fateful decision to place its bets on OpenCL.

The machine learning world rallied around the CUDA ecosystem, allowing NVIDIA to dominate that space today. Not resting on that dominance, NVIDIA significantly stepped up its developer tools offering at GTC. NVIDIA released CUDA-X, which packages a number of enterprise-ready libraries with the CUDA ecosystem for easy deployment in most of the popular machine learning frameworks.

Container vision

NVIDIA also enhanced RAPIDS, its suite of open-source software libraries for data science and analytics pipelines, accelerated by CUDA. RAPIDS is well named, as it quickly enables enterprise-level analytics.
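To make the RAPIDS value proposition concrete: its cuDF library deliberately mirrors the pandas DataFrame API, so a typical analytics step looks the same on CPU or GPU. The sketch below uses pandas as a CPU stand-in, with hypothetical telemetry data; on a RAPIDS-equipped system, swapping the import for cuDF would run the same pipeline GPU-accelerated.

```python
# Illustrative sketch only: pandas stands in for RAPIDS cuDF, which
# mirrors the pandas API (i.e., `import cudf as pd` on a GPU system).
import pandas as pd

# Hypothetical device telemetry feeding an analytics pipeline.
df = pd.DataFrame({
    "device": ["a", "a", "b", "b", "b"],
    "latency_ms": [12.0, 15.0, 9.0, 11.0, 10.0],
})

# A typical transform step: per-device mean latency.
summary = df.groupby("device")["latency_ms"].mean().reset_index()
print(summary)
```

The point of the shared API is exactly the one the article makes: existing data science code can pick up GPU acceleration with minimal rework.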

Beyond CUDA and RAPIDS, NVIDIA is driving a vision of container-based workflows—where each step of an AI workflow is in a container that’s positioned where it needs to be in the data center. It’s not an accident that NVIDIA scooped Mellanox up from a pending Intel Corporation acquisition. NVIDIA sees flexible workload deployment as the optimal architecture for complex real-time AI and ML/DL workflows, a vision that relies on fast and reliable interconnects.
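As a rough sketch of what that container-based vision looks like in practice, each stage of an AI pipeline ships as its own image that a scheduler can place on the right node. The Dockerfile below is illustrative only: the base image tag and the `train.py` script are assumptions, though NVIDIA does publish framework containers on its NGC registry (nvcr.io).

```dockerfile
# Illustrative sketch: one stage of an AI pipeline packaged as a container.
# Base image tag is an assumption; NVIDIA publishes framework images on NGC.
FROM nvcr.io/nvidia/pytorch:19.03-py3

# The stage's code is baked into the image so it can be placed anywhere.
COPY train.py /workspace/train.py

# The NVIDIA container runtime maps the node's GPUs into the container.
CMD ["python", "/workspace/train.py"]
```

A scheduler can then position this training stage next to the data it needs, which is where fast interconnects like Mellanox's come in.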

Beyond tools: jump-starting solutions

It’s a simple matter to deliver tools to the industry and hope that the tools are used to build something great. It’s an order-of-magnitude greater effort to deliver entire pieces of a long-term vision in order to jump-start the future. NVIDIA has long been associated with autonomous cars. Its DRIVE program is a targeted set of technologies to help autonomous vehicle companies jump-start their development. GTC saw the introduction of a slew of new capabilities for DRIVE, including a hardware platform, new software tools, and DRIVE Constellation, which provides autonomous vehicle simulation. The icing on the cake for NVIDIA’s DRIVE announcements was news of its collaboration with Toyota, which will be leveraging NVIDIA’s DRIVE tools as it moves its autonomous vehicle program forward.

Autonomous cars are interesting, but more world-changing is the potential of deep learning in the medical industry. NVIDIA Clara is a set of targeted software tools, including training sets, that enable developers to build medical imaging workflows using NVIDIA CUDA-enabled processors to deliver AI-enabled medical diagnostics.

These are just a few examples of how NVIDIA is going deep with software to enable machine learning. There were also announcements and activities around robotics, embedded ML, real-time ray tracing, accelerated computing, design and visualization, and more.

Concluding thoughts

NVIDIA’s greatest contribution to machine learning and AI is not its GPUs, or even its software tools. NVIDIA has positioned itself as a lighthouse, illuminating its vision for how ML-driven AI can revolutionize every industry. Yes, build your solution with us, NVIDIA says, but let us show you the power of where it will all lead.

NVIDIA preaches a vision where AI and ML look at our world, interpret it, and help us make better sense of it all. This is true across the spectrum, from technologically interesting applications such as robotics and autonomous vehicles, to business-enhancing services such as autonomous customer relations. This all leads to humanity-impacting intelligent systems such as those that can look at your medical image and help you live a longer and healthier life.

NVIDIA is both ambitious and successful. I like where the company is going, and how it’s taking us there. NVIDIA could easily relax and enjoy its near monopoly in machine learning, but instead Jensen has decided to push hard for an ambitious future. We’re all just getting started in this space, and I’m happy with what NVIDIA is doing to move us forward.

Steve McDowell is a Moor Insights & Strategy Senior Analyst covering storage technologies.
