The Six Five team discusses NVIDIA’s SIGGRAPH announcements.
If you are interested in watching the full episode you can check it out here.
Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we ask that you do not treat us as such.
Patrick Moorhead: Let me talk about SIGGRAPH, if you’ve never heard of it before. So first of all, this event is 50 years old, okay? And it’s all about researchers, artists, developers, filmmakers. It’s all about graphics and visualization, and think workstations on the hardware side and then think doing workstation-like stuff up in the cloud. Now, the interesting thing about it is that NVIDIA’s biggest announcements had nothing to do with visualization at all. I scratched my head and I’m like, “Do I not understand the product called the GH200?” Which is basically a superchip. By the way, GH means Grace Hopper: Grace is the name of the CPU and Hopper is the name of the GPU. So it’s a combo of an Arm CPU and an NVIDIA GPU. It’s all about HPC and maybe about AI. So it had nothing to do with visualization.
I’ll get to that at the very end, on why I think Jensen got up on stage there and talked about this upgrade to the Grace Hopper 200. Essentially, what it did is add memory and give higher performance memory on the GPU side. Interestingly enough, they pulled back the performance of the memory on the CPU. So a couple of interesting things there. I don’t feel like NVIDIA’s trying to pull a fast one. I think it was that they didn’t need that memory performance on the CPU to get what they needed overall for high performance computing and AI training and inference.
So you might also ask a question, “Well, why is it NVIDIA’s own CPU? Why not AMD or Intel as we see with DGX?” Well, interestingly enough, it’s not about the performance of the CPU, it’s all about the memory footprint, right? And I also think that NVIDIA can shave a tremendous amount of cost using this Arm Neoverse design as well. So the other thing it does is it gives a more straightforward ability to connect the CPU to the GPU over NVLink. See, AMD and Intel are not the most motivated to put ports on the CPU to talk directly to NVLink, and they could also turn it off from generation to generation.
So anyways, I just thought that was important stuff. So the new increased memory footprint goes from 96 gigs to 141 gigs and it’s faster memory, and that’s HBM3e. And Dan, you and I have done podcasts before on high bandwidth memory and what it means and what it doesn’t mean, and this is the latest memory. The CPU memory, like I said, actually got slower, which was interesting. Now, the form factor this goes into is MGX, which is a single server, and that led me to ask why NVIDIA would bring this out. And the only thing I can come up with for bringing out an HPC and AI chip at a visualization show is that it’s all about AMD and the MI300 and the value proposition that AMD came out with, right? Because AMD with the MI300 came out with this massive memory footprint.
And let me explain why that memory is important. When you get into large language models specifically, and I also think there are a lot of other foundational models, you can keep that workload, whether it’s training or inference, sitting in GPU memory that’s really, really fast. In fact, with HBM3e, it’s the fastest memory that you can get in that form factor. So I think this is a competitive play. I don’t know what they’re expecting AMD to come out with. AMD came out with that big announcement about the MI300 and its massive memory footprint. Oh, by the way, AMD didn’t need two cards to hit that memory footprint, they only needed one. So I’m trying to figure out what that means, Daniel. Does this mean that it’s between AMD and NVIDIA with large language models in 2024? Because by the way, this new system with HBM3e comes out in Q2 2024, which is around the same time that the MI300 comes out. What do you think, Dan? What are your thoughts?
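The memory-footprint point can be made concrete with some rough arithmetic. The sketch below is a back-of-the-envelope illustration, not anything NVIDIA or AMD published: it only asks whether a model’s weights alone would fit in the 96 GB and 141 GB capacities mentioned in the discussion. The 70-billion-parameter model, the 2-byte and 1-byte weight sizes, and the function names are all assumptions chosen for illustration, and the estimate ignores activations, KV cache, and optimizer state, which need additional memory in practice.

```python
# Rough estimate: do a model's weights alone fit in a given GPU memory size?
# Assumptions (not from the discussion): a hypothetical 70B-parameter model,
# decimal gigabytes (1 GB = 10**9 bytes), and weights-only accounting.

GB = 10**9


def weights_gb(params_billions: int, bytes_per_param: int) -> float:
    """Size of the weights in decimal gigabytes."""
    return params_billions * 10**9 * bytes_per_param / GB


def fits(params_billions: int, bytes_per_param: int, capacity_gb: int) -> bool:
    """True if the weights alone fit within capacity_gb."""
    return weights_gb(params_billions, bytes_per_param) <= capacity_gb


# The two GPU memory capacities quoted in the discussion.
CAPACITIES_GB = (96, 141)

if __name__ == "__main__":
    params_billions = 70  # hypothetical model size
    for bytes_per_param in (2, 1):  # 16-bit and 8-bit weights
        size = weights_gb(params_billions, bytes_per_param)
        for capacity in CAPACITIES_GB:
            verdict = "fits" if fits(params_billions, bytes_per_param, capacity) else "does not fit"
            print(f"{params_billions}B params at {bytes_per_param} byte(s)/param "
                  f"= {size:.0f} GB: {verdict} in {capacity} GB")
```

Under these assumptions, 16-bit weights for such a model would need about 140 GB, which exceeds 96 GB but squeezes into 141 GB with almost nothing left over for anything else. That is the kind of threshold that makes a single large memory footprint attractive compared with splitting a model across two cards.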
Daniel Newman: I think they’re running predictive analytics and they’re predicting the future. And if you’re the company that makes the most sophisticated AI chips on the planet, I would say it’s a high probability that you’re running some sort of data visualization that could show the likely path AMD is going to take with its MI-series and where NVIDIA is going to need to be competitive. I think NVIDIA has to look at how it can protect its moat right now and how it can give confidence. So if I was AMD right now, I’m out running around and saying, “Look at what we’re doing with memory. Look at how capable we’re going to be.” And of course, there’s a lot of debate right now because the logical way someone’s going to tear this market apart and really create some disruption is going to be price. NVIDIA’s got this absolute foothold on the price and the margin market.
Patrick Moorhead: Yeah, we’ll talk about price a little later when we talk about Groq.
Daniel Newman: Yeah, and nobody wants to obviously kill the golden goose here because you start getting… There’s a bit of market falloff that I think is available on these high-end processors that is going to be had just because of availability, because of the ability to serve the customer. Meaning NVIDIA’s had so much growth so fast, I think AMD and even Intel are going to be able to compel some customers, “Hey, come with us. We’ll give you our focus and our attention. Spend with us. We’re going to support the heck out of you, your specs. We’re going to build around you.” And I think they’re looking at that. And like I said, if I’m NVIDIA, what I’m asking is, what are the most likely areas where we have some vulnerabilities? You have companies like Intel and AMD that are historically CPU companies. We all know that for the biggest workloads, it is not just GPUs, it is GPUs plus CPUs. That’s the training plus inference formula for the future.
So you’ve got all these kinds of things coming together and NVIDIA is basically saying, “Look, we’ve got the market. We’ve got the customers. They’re tied in and hooked into our software and our frameworks with CUDA. Let’s make sure that they see the roadmap and they have no real reason to want to consider changing.” So I think that’s what’s happening. I think it’s ambition on the front end. I think the company’s going to continue to push the envelope and force the disruptors to be on their toes. If you want to disrupt NVIDIA, I think the easiest way to do it is with price. But we all know, because we know this from CPUs, we know this from laptops, we know this from many things in the business, that it’s a zero-sum game. So you have to decide how quickly you want to erode the profitability of the business.
So I think the early hope is to keep value high, keep margins high in the early stages because we know downward pressures will naturally occur over time. So great technical insights, Pat. Hopefully I added a little bit of flavor there in terms of what I think maybe this all means for the business.