The conversation also covers:
- An overview on the status of CXL
- How the ecosystem is responding to CXL
- Examples of early use cases that Micron sees benefiting from CXL
- Insight into how CXL will help Large Language Models like ChatGPT
Watch the video here:
Or Listen to the full audio here:
Disclaimer: The Six Five webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Patrick Moorhead: Hi, this is Pat Moorhead, and we are back with The Six Five Podcast Insider Edition. We are talking semiconductors. We are talking CXL. And I am joined here by my incredible co-host, Daniel Newman. How are you doing, buddy?
Daniel Newman: Hey man. Good to be back. Love these insider editions. Always love talking chips. Pat, didn’t we say it one time, while all the other pundits were saying software will eat the world, what did we say?
Patrick Moorhead: We pretty much said semiconductors eat the world. What are you going to do? Run software on air, right?
Daniel Newman: Absolutely.
Patrick Moorhead: It was our snarky approach to it. But we proved the naysayers wrong, and every analyst loves to do a good victory lap when they get it right like this. And we’ve gotten our fair share of these right about semiconductors, so I’m pretty happy about that. But hey, enough about us. Let’s introduce our guest, Ryan Baxter from Micron. Ryan, how are you doing? Great to see you.
Ryan Baxter: Hey, great to see you guys again. Doing really well. It’s a beautiful day.
Patrick Moorhead: It really is. And it’s been great getting to know you over the past couple years and just kind of digging in and on memory and CXL. I love CXL. You know I love CXL.
Ryan Baxter: I know you do.
Daniel Newman: Hey, Pat?
Patrick Moorhead: Yes.
Daniel Newman: Do you love CXL?
Patrick Moorhead: I love CXL, yes.
Ryan Baxter: You know what it stands for?
Daniel Newman: Say one more time. I’m not going to believe you. I’ll not believe you.
As I always like to say to my kids, who are you trying to convince, me or yourself? No, listen, you know that Pat and I, we love talking semis and we love talking technologies that can make applications either more powerful or more efficient. We love it. Let’s start there. First of all, we’ve had a conversation recently about HBM. Now we’re talking about CXL, and the industry loves acronyms, but we also have to assume that because we have a pretty big audience, from CEOs all the way to the most technical people in any organization, we like to make sure that the vernacular is clear. So give us the quick, what is CXL, and then give us a bit of an update on the status.
Ryan Baxter: Of course. CXL stands for Compute Express Link. This is a brand new interface, primarily used in servers, that is high speed and low latency and really enables some interesting new use cases that typically were either impossible or very difficult to enable with, call it, closed interfaces. This, by the way, is an interface that’s broadly embraced and standard. So this can actually be monetized by the masses, and we’re really excited about what it could mean for future server architectures, and it’s really right around the corner.
Patrick Moorhead: Can you provide an update on the status of CXL? Where are we on this map?
Ryan Baxter: Sure.
Patrick Moorhead: There’s different flavors. There are different sub-flavors out there. But listen, that’s just part of establishing an industry standard. We saw this with USB. We saw this with PCIe. So this is par for the course. Where are we on the map right now?
Ryan Baxter: Well, as you know, with any of those new interfaces, it’s not just about the hardware, it’s really about the ecosystem. And the ecosystem’s really embracing CXL, and that goes all the way from the CPU vendors, to the ASIC vendors that are developing specialized silicon that essentially interfaces with Compute Express Link, to software vendors and even customers. We’ve got to think differently, and CXL enables us to do that. We’ve got multiple ASIC vendors actually sampling silicon today, believe it or not. And so-
Patrick Moorhead: Cool.
Ryan Baxter: This silicon, combined with DRAM, can enable memory bandwidth expansion as well as capacity expansion. And for the first time, we’re actually able to test what we conjectured through simulation perhaps a year ago. We’re able to actually measure in silicon today, and we’re seeing some very, very compelling results and value driven by CXL. Of course, we saw the next version of the specification, CXL 3.0, published late last year.
Of course, that enables even higher speed in the way of PCIe Gen 6, as well as fabric-attached types of topologies that people might imagine exist in a highly connected, fabric-based server. Finally, I would say that the standards are coming along extraordinarily well. Of course you can have the best whizzbang technology, but if it’s proprietary and single source, it’s not going to be widely adopted. So the CXL Consortium as well as JEDEC have been hard at work to enable certain aspects of standardization, which I think can really help to break this market open and enable it for the masses. So that’s, in a nutshell, what’s happening with CXL.
Daniel Newman: All right, so love to hear the background. The ecosystem’s always important. That’s just really how this whole industry works, but use cases matter. How are people putting this to use? Give us the early use cases that you’re seeing that are benefiting from CXL technology.
Ryan Baxter: Good question. We have been looking at the use case of primarily memory capacity expansion, and this really shows up in the way of customers wanting to use a whole lot more memory in a more cost-effective way than what a non-CXL-enabled server can typically support. Now, there are ways around that, but they are very costly. These are methods that use a technology called through-silicon vias, which are very interesting but pretty tough to scale in terms of cost. And what CXL provides is a nice pressure release valve in the way of a significant TCO benefit to get that memory capacity expansion, not in the main memory footprint of a server, but rather sitting alongside that main memory footprint, off of that CXL bus. So you can think of very, very large in-memory database types of applications that can benefit from that larger, more cost-effective memory footprint.
Of course, bandwidth expansion: a single x8 channel of CXL provides roughly the same amount of memory bandwidth as a single DDR5 channel. So obviously for bandwidth-starved applications, that becomes extremely interesting. And then the capacity expansion is just icing on the cake from that perspective. We are also seeing this concept of, again, it’s connected to the first thing I talked about, but this TSV mitigation really providing a much more cost-effective way to add a whole lot more capacity to, say, a cluster of servers.
And that can be in the form of, say, adding memory footprint for a single CPU. It could also be in the form of adding, say, a pool of memory so that multiple CPUs can access that pool of memory. That becomes very important when you’re thinking about applications that are increasingly hungry for additional high performance capacity. Now, you can always use other tiers of the memory storage hierarchy, but of course you’re paying a performance penalty by heading out to an SSD for your data. So CXL really does provide that high performance use case, shot in the arm, if you will, to some of the more interesting applications.
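Ryan’s bandwidth comparison can be sanity-checked with some back-of-the-envelope arithmetic. The generations below are assumptions, since the conversation doesn’t pin them down: an x8 link at PCIe Gen5 rates (the physical layer CXL 2.0 rides on) against a DDR5-4800 channel.

```python
# Rough bandwidth comparison: x8 CXL link vs. one DDR5 channel.
# Figures are illustrative assumptions (PCIe Gen5, DDR5-4800),
# not numbers stated in the conversation.

PCIE_GEN5_GTS = 32          # GT/s per lane at PCIe Gen5
LANES = 8                   # a "x8" CXL link
# PCIe Gen5 uses 128b/130b encoding; divide by 8 bits per byte:
cxl_gbps = PCIE_GEN5_GTS * LANES * (128 / 130) / 8   # GB/s, one direction

DDR5_MTS = 4800             # MT/s for DDR5-4800
BUS_BYTES = 8               # 64-bit data bus = 8 bytes per transfer
ddr5_gbps = DDR5_MTS * BUS_BYTES / 1000              # GB/s

print(f"x8 CXL (Gen5):  ~{cxl_gbps:.1f} GB/s per direction")  # ~31.5
print(f"DDR5-4800 chan: ~{ddr5_gbps:.1f} GB/s")               # ~38.4
```

Under those assumptions the two land in the same ballpark, which matches the “roughly the same amount of memory bandwidth” claim.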
Patrick Moorhead: I want to do the double click inside a topic that Dan and I can’t go an hour without discussing. And those are things like natural language models like ChatGPT. It is amazing, all of the talk, and this one’s real, right?
Ryan Baxter: Yes.
Patrick Moorhead: This one’s real and believe that it’s going to last a long time to provide incredible amount of benefit to consumers and businesses, but hardware players need to step up and support it in more efficient ways. It takes 10 x the amount of resources to train a model, and it takes a lot more resources to infer one of these models. How does CXL intersect with this? How does it benefit in making either blasting out more performance or doing it a lot more efficiently at a lower power?
Ryan Baxter: It’s a great question. It’s top of mind for a lot of folks these days. I think ultimately what CXL enables in this space is optionality. You don’t have to bring a sledgehammer to the job anymore. You can pick the tool you bring to things like generative AI and large language models. And so what CXL provides is that flexibility. It’s flexibility in the form of enabling things like switches and fabrics to really leverage efficient data flow and efficient connection of compute elements across fabrics, rather than pushing everything to a single spot in the data center. You might be able to imagine a situation where, with CXL-enabled fabrics, you can solve that same problem in a much more distributed, probably more power-efficient way. It also provides significantly better scale when it comes to housing some of those large language models.
They’re large and getting larger. These are models that have billions, if not hundreds of billions, of parameters, and growing over time. Some of the more interesting problems no longer fit inside the memory footprint of a standard server. So being able to leverage CXL for that additional capacity expansion that you need to house that model, as I said, in a very high-performance, low-latency medium is going to be critical, I think. And really able to break open the democratization of AI and allow system designers and architects quite a bit more flexibility to think about how else to solve this problem other than the traditional ways we’ve been able to do it thus far.
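The “no longer fit inside the memory footprint” point is easy to see with quick sizing math. The parameter count and data type below are illustrative assumptions (a GPT-3-class model in 16-bit weights), not figures Ryan gives:

```python
# Hypothetical sizing for a large language model's weights.
# 175B parameters and fp16/bf16 storage are illustrative assumptions.

params = 175e9              # 175 billion parameters
bytes_per_param = 2         # 16-bit (fp16/bf16) weights

model_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{model_gb:.0f} GB")  # ~350 GB

# And that is only the weights: activations, optimizer state, and KV
# caches multiply the footprint further, which is why capacity
# expansion beyond the DIMM slots becomes attractive.
```

At hundreds of billions of parameters and growing, the working set quickly outruns what a standard server’s DIMM slots can hold cost-effectively, which is the capacity-expansion case Ryan describes.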
Daniel Newman: So Ryan, just quick follow up on that, because I like talking about large language models a lot. Do you think this could be the breakout technology? Because I’ve talked to a lot of press, I’ve been asked about CXL a lot. And it’s like it caught on and it went away. And I’m not saying it never went away for you, I’m sure, but you look at how it gets hot as a stage topic at tech events, but with all the onset of this AI stuff, do you think this is going to be maybe what really puts CXL back into the limelight again?
Ryan Baxter: Yes, it could very well be. Again, it offers some really interesting capability that was desperately needed even prior to some of the buzz around generative AI. I think this could again be kind of that shot of adrenaline to accelerate the potential to solve new problems much, much faster than you otherwise would’ve been able to. Again, this is an open consortium, a group of companies that have embraced this technology and can innovate at their own pace. And I think the collection of that innovation across the entire ecosystem is really what’s going to accelerate exactly what you just said. It’s providing a lot more benefit to a lot more corners of the community, to enable folks to solve all sorts of problems, including generative AI problems.
Patrick Moorhead: Ryan, we started the conversation by asking the question, what is it and where are we on this map? And I guess nobody would argue this, there are essentially three specifications that are out there, right?
Ryan Baxter: Yes.
Patrick Moorhead: One, two, and three, and then there’s some different adders on top of that, some optionality. What is the next big step that people need to be ready for? What should they be ready for upcoming? And I’ll leave it up to you whether it’s six months, year, three months.
Ryan Baxter: Great question. There’s been a lot of work around the specifications, a lot of work trying to understand aspects of the value proposition that CXL can bring to the table. I think the next big thing is to actually see this in silicon, in volume silicon at scale: what it could mean to an entire data center, for instance, in terms of efficient data movement, adding significant capability, and really driving the additional sustainability that needs to happen in this industry. And that’s going to happen really over the next, say, 12 to 18 months. It’s very exciting to see this transpire, and of course thinking about the next generation or the next specification version doesn’t stop. That happens in parallel. And so it’s about watching the industry leapfrog itself in terms of what’s next. So silicon at scale, I think, is the next thing to watch. And then after that, sharing and pooling become extremely interesting in the CXL 3.0 timeframe. So strap in, it’s going to be a fun ride.
Daniel Newman: I love strapping in. Any ride in the semiconductor space gets me excited, or fast cars, rocket ships, roller coasters. So you’ve got me excited. So look, as we come to time here, any other broader thoughts or observations around the CXL space that you want to share with our audience?
Ryan Baxter: Yes. I said it at the front of the talk, I guess. CXL is real. This is not something that’s 3, 4, 5 years away anymore. We’re actually looking at actual silicon and seeing benefit from actual use cases. So number two, I would say there’s broad excitement. Of course things come and go as far as what people get excited about, but CXL is being designed in everywhere and it’s interesting and what it enables is very, very valuable in terms of TCO and performance and memory capacity expansion benefits.
Finally, I think CXL provides, or enables, a significant expansion of new innovation, new ways to solve either old problems or brand new problems that were never able to be solved with traditional architectures. So the industry is actually going to have to make some pretty interesting decisions, because we probably can’t do everything. We have to focus and really drive the best bang for the buck when it comes to the benefit for the entire community. So just a few closing thoughts: CXL is real and it’s coming, as I said. Get ready for it.
Patrick Moorhead: It’s exciting stuff. I love talking CXL and this is going to be one of the big drivers of data center. I know there’s a lot of benefits between now and then. That is the talk when I talk to the hyper scalers, about next generation architectures and of course there’s the discussions, the gnashing of teeth that you would expect. But that’s the case with any major tectonic shift right out there. I’m really excited about it. Ryan, great to see you buddy. Let’s do this again.
Ryan Baxter: Thanks. Will do. Take care guys. Have a good day.
Daniel Newman: All right. Thanks, Ryan. All right, everybody, there you have it. A great Six Five Insider here today, talking CXL, memory, large language models, ChatGPT, we fit that in. It’s in every pod now. What the heck? But hit that subscribe button. Join us for all the episodes here of The Six Five, our weekly show, our insider editions, our summit, whichever it is, The Six Five, we’re always here. We’ve always got you when it comes to the most important conversations in the tech space. But for this episode, from Patrick Moorhead and myself, it’s time to say goodbye. We’ll see you all later.