High Performance Computing (HPC) has historically depended on numerical analysis to solve physics equations, simulating the behavior of systems from the subatomic to galactic scale. Recently, however, scientists have begun experimenting with a completely different approach. It turns out that Machine Learning (ML) models can be far more efficient and even more accurate than the time-tested, number-crunching simulations in use today. Once a Deep Neural Network (DNN) is trained, using the virtually unlimited data sets from traditional analysis and direct observation, it can predict or estimate the outcome of a simulation–without actually running it. Early results indicate that by combining ML and traditional simulation, these “synthesis models” can improve accuracy, accelerate time to solution, and significantly reduce costs. If widely adopted, this will further fuel NVIDIA ’s AI growth, since it is already the incumbent provider of accelerators to the HPC community.
The intersection of Machine Learning and HPC
Machine Learning models (which can be developed, trained, and executed on the same GPUs used to accelerate simulations), can be used to solve problems that are extremely complex. For that matter, it can do so with far fewer resources than traditional approaches. ML can be more efficient for two reasons. First, numerical analysis usually demands costly 64-bit floating-point calculations, while a trained neural network usually requires only 8-bit integer calculations. Training a DNN is certainly computationally demanding, requiring many fast GPUs and potentially trillions of calculations. Once trained, however, the DNN can be used with simple integer math. Second, the entire approach functions by finding patterns in existing data, instead of calculating the numerical equations. Consequently, early research projects have shown that ML often consumes far fewer resources to unlock problems that have historically been beyond the grasp of traditional simulation.
The benefits of Machine Learning in HPC
While ML is a relatively new feature on the HPC landscape, scientists are already applying synthesis modeling in research and seeing some impressive results. While case studies in this early phase of research are few and far between, here’s a sampling of research projects (many of which will be highlighted at the annual supercomputing confab in Denver, SC17):
- LIGO Signal Processing (NCSA) – 5000X faster
- Predicting Molecular Energetics (UFL/UNC) – 300,000X faster
- Analyzing Gravitational Lensing (SLAC Stanford) – From weeks to 10 milliseconds
- Generating Bose Einstein Condensate (UNSW): 14X faster
- Sustaining Fusion (Princeton): Improved disruption prediction from 85% accuracy to 90% accuracy
- Tracking Neutrinos (Fermilab): Improved detection rate by 33%
- Protein Ligand Scoring (University of Pittsburgh): Improved pose prediction accuracy from 52% to 70%
There are three approaches being used to apply ML to HPC problems. First, it can be used to modulate simulation or experiments between successive iterations—accelerating convergence to a stable, reliable model. Researchers working on fusion power at Lawrence Livermore National Labs have been using ML to check for divergence during simulation runs, automatically tuning parameters to keep the simulation on track. They have reported significant gains in speed using this technique.
Another approach is to enhance existing simulations to improve accuracy and reduce latencies. Here, simulation provides both a starting point and the training data for the neural networks to refine the output of the numerical model. A striking example of this approach is in high-resolution ray tracing. This traditional computationally intensive approach creates a “true” image, which is then used to train a DNN to produce additional high quality images, with far fewer resources.
Figure 1: Machine Learning can produce high quality images with far less compute resources compared to traditional ray tracing.
Finally, perhaps the most impactful use of Machine Learning in HPC is the replacement of numerical simulation models with ML-based approximations. This approach has the potential to transform HPC. However, adoption will require scientists to embrace a method that may eventually render the codes they have spent decades developing obsolete. In practice, the results can be dramatic. Scientists at University of Florida and University of North Carolina have seen the benefits in drug discovery research, where they were able to reduce compute time from minutes to microseconds—a reduction of 6 orders of magnitude. This can have a dramatic impact on the time required to screen new drug candidates, a process that can take up to 5 years using traditional CPUs.
This new approach is still in its infancy, and is somewhat controversial. However, Machine Learning researchers have demonstrated they can reduce computing resources and energy consumption by orders of magnitude, while improving accuracy and lowering latencies. Given the impressive early results, it is becoming clear that some advances in HPC may not have to wait for Exascale class systems after all—they are being realized today using Machine Learning methodologies. I expect this trend to accelerate significantly in the next few years, given the hype around AI, the funding available from governments and industry, and the extremely efficient GPU hardware now available.