Summit Supercomputer – Summit storage, racks, and cooling
Last week, IBM announced, in conjunction with the U.S. Department of Energy’s Oak Ridge National Laboratory (ORNL), some impressive AI performance numbers from the Summit supercomputer. ORNL is hailing Summit as the world’s “most powerful and smartest supercomputer” currently in existence, unseating the previous record-holder, a Chinese supercomputer named Sunway TaihuLight. Based on its self-published, mixed-precision performance numbers, Summit is the new AI beast. It was built with 4,608 of IBM’s newest-generation POWER9 systems (which I’ve written about previously here and here) and 27,648 of NVIDIA’s Volta GPUs. IBM says this is the first supercomputer designed expressly for AI, and that focus was no accident. Let’s break Summit down.
The end goal ORNL is working towards is building the world’s first exascale computer, capable of performing a billion billion calculations per second in full precision. Summit looks to be a significant milestone along that journey, boasting peak performance of 200 petaflops—that’s 200,000 trillion calculations per second in classic, full precision HPC measurements. But by applying mixed precision AI algorithms to certain scientific applications, IBM says that Summit will be able to perform more than three billion billion mixed precision calculations per second—3.3 exaops.
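To make the scale of those figures concrete, here is a rough back-of-the-envelope sketch in Python. The inputs are the numbers quoted above; the constant and function names are purely illustrative and do not come from IBM or ORNL. Note that the last ratio compares mixed-precision operations against full-precision ones, so it is not a like-for-like speedup.

```python
# Unit conversions for the performance figures quoted in the article.
# All names are illustrative; only the input numbers come from the text.

TRILLION = 10**12
PETA = 10**15
EXA = 10**18  # "a billion billion"


def petaflops_to_trillions(petaflops):
    """Express a petaflops figure in trillions of calculations per second."""
    return petaflops * PETA / TRILLION


def fraction_of_exascale(petaflops):
    """How far a petaflops figure is along the way to one exaflop."""
    return petaflops * PETA / EXA


summit_peak_petaflops = 200   # full-precision peak, per the article
summit_mixed_exaops = 3.3     # mixed-precision figure, per the article

peak_in_trillions = petaflops_to_trillions(summit_peak_petaflops)
exascale_fraction = fraction_of_exascale(summit_peak_petaflops)

# Ratio of the mixed-precision figure to the full-precision peak.
# These measure different kinds of operations, so treat it as indicative only.
mixed_to_full_ratio = summit_mixed_exaops * EXA / (summit_peak_petaflops * PETA)

print(f"Peak: {peak_in_trillions:,.0f} trillion calculations per second")
print(f"Fraction of full-precision exascale: {exascale_fraction:.0%}")
print(f"Mixed-precision figure vs. full-precision peak: {mixed_to_full_ratio:.1f}x")
```

In other words, at full precision Summit sits at about one fifth of the exascale target, even though its mixed-precision figure already exceeds three exaops.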
Housed at ORNL in Oak Ridge, Tennessee, Summit is, well, quite big. It occupies a space roughly the size of two tennis courts, housing 4,608 interconnected “nodes” inside cabinets the size of refrigerators. Summit will purportedly be eight times more powerful than Titan, the lab’s previous champion supercomputer. Titan, for the record, was no slouch; it was a real game-changer, as the first supercomputer to employ the tactic of including a GPU in every node. Summit takes this strategy a step further with six NVIDIA Tensor Core GPUs and two IBM POWER9 chips per node. Each node also features 1.6 terabytes of memory. Overall, Summit has 250 PB of storage, including 1.6 TB of IBM Spectrum Scale NVMe storage, connected via InfiniBand and PCIe Gen4. Yes, Summit requires an immense amount of power: about 15 megawatts at peak consumption. That’s to be expected with a system of this magnitude and power.
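The per-node figures above can be multiplied out to system-wide totals with a few lines of Python. This is only a sketch based on the numbers quoted in this article: the variable names are my own, the memory total assumes decimal units (1 PB = 1,000 TB), and the efficiency figure simply divides the quoted peak performance by the quoted peak power, which is not how official efficiency rankings are measured.

```python
# System-wide totals derived from the per-node figures quoted in the article.
# Variable names are illustrative; decimal units (1 PB = 1,000 TB) are assumed.

NODES = 4608
GPUS_PER_NODE = 6
CPUS_PER_NODE = 2
MEMORY_PER_NODE_TB = 1.6
PEAK_PETAFLOPS = 200      # full-precision peak
PEAK_POWER_MW = 15        # approximate peak power draw

total_gpus = NODES * GPUS_PER_NODE
total_cpus = NODES * CPUS_PER_NODE
total_node_memory_pb = NODES * MEMORY_PER_NODE_TB / 1000

# Rough efficiency estimate: peak gigaflops divided by peak watts.
peak_gigaflops = PEAK_PETAFLOPS * 1e15 / 1e9
peak_watts = PEAK_POWER_MW * 1e6
gigaflops_per_watt = peak_gigaflops / peak_watts

print(f"GPUs: {total_gpus:,}")
print(f"POWER9 CPUs: {total_cpus:,}")
print(f"Aggregate per-node memory: {total_node_memory_pb:.2f} PB")
print(f"Rough peak efficiency: {gigaflops_per_watt:.1f} gigaflops per watt")
```

The GPU total works out to 27,648, which matches the Volta GPU count cited earlier.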
The possibilities posed by Summit’s power are fairly mind-boggling. The system can handle workloads from traditional modeling and simulation to data analytics and deep learning. According to NVIDIA, Summit already has a packed schedule, lending its power to projects in several interesting research areas: cancer research, fusion energy, and disease and addiction, to name a few. It is exciting to think about the potential here, putting the most powerful AI supercomputer ever built to work on some of humanity’s biggest unanswered questions.
It’s worth noting that the same IBM POWER9 and NVIDIA architecture that Summit uses is also available commercially to enterprises and cloud service providers via IBM’s Power Systems AC922 system and the entire family of POWER9-based servers. POWER9 is still in the relatively early phases of deployment, and this high-profile (and highly impressive) use case could give IBM some added clout with which to entice clients. It certainly won’t hurt IBM’s case that it holds the record for the highest-performance AI supercomputer.
With Summit, ORNL is inching ever closer to its exascale goal, which it hopes to achieve by 2021. This is the fastest system in the world, an achievement that everyone involved should be proud of. From a geopolitical standpoint, I’m sure the U.S. DOE is thrilled to have wrested the title away from China. It wouldn’t have been possible without IBM’s awesome POWER9 architecture and NVIDIA’s best-in-class GPUs. I’m excited to see what new possibilities a supercomputer of this power will unlock.
Note: Moor Insights & Strategy writers and editors may have contributed to this article.