On this episode of The Six Five – On The Road, sponsored by Intel, hosts Daniel Newman and Patrick Moorhead welcome Intel’s Greg Lavender, CTO, SVP and GM of Intel SATG, and Sandra Rivera, EVP and GM Intel DCAI for a conversation on Intel’s AI Portfolio during Intel Innovation in San Jose, California.
Their discussion covers:
- What is unique about Intel’s AI strategy, and Intel’s focus on democratizing AI
- How Intel customers are using Intel Gaudi2 accelerators today
- What Intel is doing to make it easier for developers to build AI solutions that run on Intel hardware
- What enterprises need to be doing to prepare for this coming wave of AI inferences
Be sure to subscribe to The Six Five Webcast, so you never miss an episode.
Disclaimer: The Six Five webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Patrick Moorhead: Hi, this is Pat Moorhead and the Six Five is on the road in San Jose at Intel Innovation 2023. Dan, what an event it has been so far. I mean, announcements pretty much all around the world. I mean, whether it’s a client, whether it’s a data center and pretty much everything in between. I love it. I love tech. But even better, I love useful tech that can make change not only for businesses and consumers, but society as a whole. And I feel like we’ve pretty much gotten all of that.
Daniel Newman: Absolutely, edge to cloud, outside of AI. We do it all here because, well, like you said, silicon rules the world. So it’s a great event. I’m happy to be here and happy to have had some really great conversations and heard some thoughtful, introspective, and visionary commentary from many of the executives, the partners, and of course the developer community, too.
Patrick Moorhead: Yeah, it’s been an incredible run. I mean, generative AI isn’t new, didn’t start last November, but so much of that conversation was happening. And by the way, also, the dawn of generative AI doesn’t mean that analytics, machine learning, deep learning just suddenly go away. Technology is additive. If you look where we started even 40 years ago, that’s still very much in the market. There are new things that layer on top of it that are better for certain applications, but it’s super interesting what is happening in the data center. It’s the beginning for IaaS, SaaS, PaaS, and for data centers around the world. I’ve never seen this amount of excitement in years. It does remind me of the early days, right when the web hit and then ecommerce hit, and then social, local, mobile, and then the cloud. So here we are on this amazing AI journey.
Daniel Newman: Yeah, there’s so much and AI is a part of it, but AI seems to be a part of all of it. So it’s probably one of the areas I would love to spend a little bit more time on and imagine if we could have some really thoughtful guests come on our show and we could talk more about what’s going on with AI.
Patrick Moorhead: No, exactly. Greg and Sandra, welcome to the Six Five.
Sandra Rivera: Thank you.
Patrick Moorhead: Great to have you.
Sandra Rivera: Good to be here.
Patrick Moorhead: Greg, it’s great to see that we had a good experience when we did the kickoff. You’re back, we didn’t scare you away. People love that video. It’s out there, got them all primed up for what we’ve been seeing over the last couple of days. So welcome back.
Greg Lavender: Thank you. Glad to be here.
Daniel Newman: Yeah, see, you both heard Patrick and I sort of layering on the AI sauce before we got started here. And, I mean, that’s kind of what it is. I mean, everybody right now is looking at the market. They’re looking at what companies are leading, what companies are following, which companies have solutions for today, which companies have solutions that are driving revenue and can be counted, which are the companies that are going to be contributing meaningfully to what’s going to come next. And it’s all about the application.
But for Intel, a lot of eyes are on… This company has been leading silicon development for the longest time. And on a global scale, I think people want to hear what Intel is going to do and how it’s going to be a big player in this AI space.
Sandra, I want to start out with you to talk about that. Tell us about what’s different, unique, and exciting about Intel’s AI approach.
Sandra Rivera: Well, I want to just add to the idea that AI is layering on capability, that it is accretive in terms of the overall technology-
Daniel Newman: Yeah, love that word.
Sandra Rivera: … landscape and that frankly, we’re in the early days of AI. Sometimes I feel like everything that gets written is, “Oh, it’s done and dusted and it’s over.”
Daniel Newman: Exactly.
Sandra Rivera: But truly, we are very much in the early days. It is such a transformative technology. It will impact every single industry, every single part of our lives, how we live, how we work, how we play.
And for Intel, we are very excited to occupy what I’d say is a unique space in the fact that we are taking a very open approach. Open is in our DNA. We are just big believers that when you open up the market opportunity, lower the barriers to entry, you increase market participation, and then you accelerate the rate of innovation. And clearly, this is an area where there’s lots of innovation from the cloud to the edge and to the client, which is a lot of what you’ve heard here at Innovation.
It’s not just what’s happening in the 10,000-node GPU clusters that are dedicated for many months and many megawatts of power and many tens of millions of dollars, but it’s just the day-to-day AI that happens in the enterprise clearly where enterprises of all sizes and shapes are looking for more productivity, more efficiency, but also in your client device and the ability to have more efficiency and effectiveness of your day-to-day activities.
So Intel’s anchored on open, we’re anchored on offering a choice to a clear market leader, and really leaning into that cloud to edge to client. With the ubiquity of our compute through our CPUs, and now adding in and layering our GPUs and our AI accelerators, it’s an exciting time for us to meet the moment and to address our customer requirements.
Patrick Moorhead: When I look at that strategy, and it took a while to form it, but what’s funny is earlier in the year, I feel like if I would’ve guessed at what Intel’s AI strategy would be, exactly that for a lot of reasons. And first of all, you play in all these areas, AI is pervasive, compute’s pervasive, AI is pervasive. And the second thing is, historically, we’ve seen that probably nine out of 10 big swings that tech has taken has benefited from higher degrees of openness. And I do think it’s just vital that we keep this open to keep on innovating.
So Greg, you’ve talked about AI everywhere and Sandra alluded to it, talking about the strategy. From your point of view, from your CTO point of view, what does that mean and why does it matter?
Greg Lavender: Well, I think I’ll talk about the software part of it. So I think we have this rich portfolio, we do have a rich portfolio of hardware technologies. And also I can say in my time at Intel, what’s so impressive is just the depth and breadth of technical experience, skills, and product portfolio is frankly unmatched in the industry along with our fabrication capabilities. And so if you look at that foundation, and also when we’re deep tech, we’re not out there talking at the highest layers of the ecosystem. We’re deep tech and that takes time to get it right and to execute it and deliver it with the right performance, power efficiency profiles, et cetera.
And that’s what Sandra and I partner closely on with the data center side: the software and the hardware. Pat said this word, software. Software-defined, silicon-accelerated or silicon-enhanced. And so everything we’re doing is to get the silicon right, get the quality right, get the scale and the performance right, and then deliver the software stack, open source software, as ubiquitously as we can from client edge to cloud data center, and seed the whole ecosystem with that. And that will lead to developer productivity, which is what Intel Developer is all about: give them great deep tech, give them great software, collaboratively through the ecosystem, and get those developers productive, doing their jobs wherever they want to do them. And we got our act together and we’re going to execute that and we’re going to deliver it to the market.
Daniel Newman: So Sandra said something really interesting, and I really liked your comment about… I’ll paraphrase it in my own words, but basically people are acting like the race is over. And I would argue that we’re really just first feet off the starting line when it comes to AI. And so something you said is you’re talking about open, you’re talking about democratization. And Greg, just a quick follow-on to your comment: the world is, I think, clamoring for that. But with a lot of this early “the race has been run” talk, the pundits, thought leaders, and journalists have almost made it sound like, “Oh, it’s too late.” But, I mean, you’re really sticking to that vision that, “Hey, there’s a whole wave ahead” and this open approach is critical to really enabling the power of AI, right?
Greg Lavender: It always wins in the long term, just look. And there are obviously proprietary islands of technology that get a favorable competitive advantage for a period of time, but ultimately the market wants choice. They want diversity, they want stability, and they want quality, as well.
But that’s one of the reasons we launched Intel Developer Cloud. We sort of dreamed that up, which I think is very innovative for Intel. And we had Pat’s support for it. It is essentially to get our latest, greatest technology out there into the Intel Developer Cloud in the hands of developers before it ever shows up in a server from an OEM or a device from an ODM or in a public cloud. So we’re committed to making sure our latest technology, particularly on the Xeon roadmap, our GPU technology, our Gaudi2s, that all that technology is widely deployed. We’re scaling it as fast as we can build it out and wire it up in the data centers in our colo facilities where we’re running Intel Developer Cloud, because that becomes the playground where you can come in and play for free for a little while. We’ll let you go for free for a while. You start getting value out of it, and you’re going to hear about some of that value in the presentations we have.
But basically we can do this at scale, and this is a new mental muscle for the company. We’re building large-scale systems, architectures, compute, networking, storage, software, operational stability, and operational scale. This is not something Intel has historically done. We’ve talked about rack-scale computing that other people do. We’re doing that ourselves and we’re doing it in partnership with our closest customers so that they can take advantage of it, as well. And so I think this is really the game-changer for us is we can get our products and technology to market faster so that developers can become convinced that it’s competitive, it can meet the performance, and, with our latest MLPerf numbers, exceed the performance of our competitors.
Patrick Moorhead: Speaking of the latest MLPerf numbers, it’s funny, I always knew that Intel would rack up some numbers. But like we talked about, some of these bets are five years old. Silicon is hard, especially when you have to make these huge commitments. But Gaudi2 is making the headlines and got on our radar screen for the Six Five, for our podcast, we’ve written a bunch about that. But can you talk a little bit about the latest and greatest out there? I mean, you’re racking up great scores, you’re participating in benchmarks that maybe that others haven’t.
Daniel Newman: We love talking about the fact, by the way, that accelerators will sometimes kick GPUs. You know what?
Patrick Moorhead: Well, listen, I just look at the wheel of efficiency, with CPU, ultimate programmability, on one end, ASIC, ultimate efficiency but harder to program, on the other, and there’s all this other stuff in between. But what’s happening with Gaudi2? Where’s all of this action coming from?
Sandra Rivera: Well, so I would kind of dial the clock back to the strategic acquisition that we made with Habana Labs, which we’re coming up on four years now, and just the execution machine that we have with that organization and the fact that we delivered Gaudi1 on time, on budget, in spec. Gaudi2 similarly, we actually delivered that product last year, but it’s had a resurgence in terms of the amount of interest, given all of the generative AI and large language model interest that obviously really launched last fall. And we have been making excellent, steady, methodical progress with that product.
If you look at the MLPerf results from last November to this May to just the ones we published last week, just clearly beating the market leader, A100, which is the most pervasively deployed GPU today, handily in terms of raw performance, in terms of throughput, time to train, as well as cost per token, and beating on power efficiency.
When we look at H100, particularly with the FP8 data format that we released the software for, A100 doesn’t have it, H100 does have it. From a price performance perspective, we still beat H100 because, as we know, the price differential for the premium product in the market is quite substantial.
And not every customer needs peak roofline performance for every single workload. And this is where we think we’ve got an excellent alternative for customers and a lot of appetite to give it a try. And especially when we look at what we are putting into play in the Intel Dev Cloud, just creating that sandbox for customers to come in, to try their models out, to see the ease of portability and the power efficiency, the time to train, the cost-effectiveness of a Gaudi solution, we’re pretty excited about that.
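The price-performance argument above can be made concrete with a minimal sketch. The part names, throughput figures, and prices below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not MLPerf results or real list prices for any product.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    tokens_per_second: float  # hypothetical training throughput
    price_usd: float          # hypothetical purchase price


def throughput_per_dollar(acc: Accelerator) -> float:
    """Price-performance: throughput delivered per dollar spent."""
    return acc.tokens_per_second / acc.price_usd


def better_value(a: Accelerator, b: Accelerator) -> Accelerator:
    """Return whichever part delivers more throughput per dollar."""
    return a if throughput_per_dollar(a) >= throughput_per_dollar(b) else b


# Illustrative numbers only: the premium part is faster in raw terms,
# but costs twice as much as the alternative.
premium = Accelerator("premium_part", tokens_per_second=3000.0, price_usd=30000.0)
alternative = Accelerator("alternative_part", tokens_per_second=2000.0, price_usd=15000.0)

winner = better_value(premium, alternative)
print(f"Best throughput per dollar: {winner.name}")
```

With these made-up inputs the slower part still comes out ahead on throughput per dollar, which is the shape of the claim being made: a part can lose on peak performance and still win on price-performance if the price gap is large enough.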
I will say that in addition to that, if you notice the only company that submitted CPU MLPerf leadership results was Intel with-
Patrick Moorhead: Very much noticed that.
Sandra Rivera: Yes, with our Sapphire Rapids 4th Gen Xeon, which, of course, has integrated AI acceleration capability.
So I want to go back to your comment. Silicon development cycles are long and complex endeavors. And the good news is that we never stop innovating. We never stop inventing. Greg talked about how sometimes it takes time to get things right. We’re actually trying to speed that up a bit and get to market faster. And just the amount of progress we’ve made over the past year is pretty phenomenal. And yes, we’re getting a lot of incoming interest on, “Hey, what is this thing Gaudi2? Hey, maybe 4th Gen Xeon is something that I could deploy” for many of the classic machine learning workloads. Certainly all the inference-type of opportunity, which is the highest growth opportunity that we see in the market. And the more traditional enterprise OEM go-to-market sales motions that we have, given that’s still a very strong share position for Intel with our CPU.
Patrick Moorhead: I do like the variability in the silicon. And again, CPU, ASIC. Oh, by the way, CPU with some ASIC blocks, which you very much have in Xeon. GPU and GPU-like architectures are from a flexibility standpoint. FPGAs play in this game as well. And I think it’s ignorant to talk about just one way to approach this, especially when you’re looking from end to end, from edge to cloud and everything in between. And in the end, it depends on what you’re trying to do. And right now, you are the only company, and this is just a fact, not anything else, that has that variability, that has every one of these bases covered. And I know it’s easy to say it from an analyst’s perch, I get it. I did have a real job for multiple years. I do understand that, too.
Daniel Newman: I resent that. I feel like we do work pretty hard.
Patrick Moorhead: Well, I got to tell you, advising and talking about what happened and why, I personally have found it a little easier than actually doing it.
Daniel Newman: Oh, absolutely, absolutely.
Patrick Moorhead: Anyways-
Daniel Newman: It’s interesting everything you said because I went off script and I was asking Greg questions about open and I was kind of trying to go that path, and then you started kind of mentioning one type of silicon to rule them all in AI. And it’s interesting how we’ve sort of come to that, as well. It’s like one type of framework. Nope, that’s not how it’s going to end up. Everything’s not going to be on a GPU. We see the future.
Patrick Moorhead: How about that?
Daniel Newman: And by the way, when we do, we tell everybody, and when we get it wrong, no one ever hears about it again.
Patrick Moorhead: We don’t bring that up afterwards.
Daniel Newman: Greg, I want to come back to a little bit about the software and the frameworks and the development side. This event tends to drive a lot of presence from developers, and they’re all here. You alluded to it; I heard something about a developer cloud. Talk a little bit about how Intel is going to really make it possible for developers to embrace open and make open maybe the most accepted path forward for the future of AI.
Greg Lavender: Well, I think again, developers more and more are looking for productivity and efficiency of their time. And so it’s really about they want to consume platforms and they don’t want to consume just one platform, depending on whether it’s inference, whether it’s training, whether it’s large language model training. There’s lots of normal training that goes on that happens on a Xeon CPU, by the way, because even with not large, large language models, but larger dataset sizes. And by the way, everybody still has to do all their data processing, data management, structured data tagging, get all stuff prepared before they feed it into their GPUs for training.
But I think that developers, again, gravitate toward open ecosystems where all the pieces play well together. And that’s really what our oneAPI story is. It’s not one API, it’s a collection of technologies, libraries, tools, runtimes, et cetera that you can put together to do things in a heterogeneous multi-platform way.
And I think with regard to the CUDA lock-in, if we just put that one on the table, I think there’s a couple of things happening in the industry that are going to disrupt that. And one is Triton, which OpenAI originally brought to market. It’s open source, and we’re contributing to it. In my demo earlier today, I gave a demo of using Triton on PyTorch to basically take a workload and run it on our Max GPU, what is known as Ponte Vecchio, with very little code change because we use MLIR technology, which generates the code for the kernels. And you don’t actually have to write in SYCL or you don’t have to write in CUDA. You write it all in Python and the code is automatically generated for you. Same thing is happening with OpenXLA and JAX in the Google ecosystem. And the same thing is happening with Mojo from Chris Lattner’s team.
So I think the language world, it’s all high level programming languages in Python, there’s your productivity. Okay, now you want the performance. Well, you don’t need to be a performance head to get the performance. The compiler and the runtime ecosystems now will deliver that for you across multiple architectures, whether it’s an Nvidia or what have you. So proprietary lock-in above the driver layer, I think, is over. This will take some time, but openness is going to win.
And we just announced this foundation, the Unified Acceleration Foundation (UXL) through the Linux Foundation with several customers like Google and Arm and Samsung and others participating in this, which is to drive these open standards and to drive open source ecosystems to level the playing field for everybody, and let the best hardware win.
Patrick Moorhead: Having met with your customers, your customers’ customers, I mean, this is exactly what they want now. And I’m hopeful that that gear locks in with developers and they make it happen. I mean, the industry needs it at this point.
So I like to talk about the tech industry having a certain sense of reality, what’s going on, and then sometimes it’s covered in the press. And if you go and you measure what’s being talked about, there seems to be this fascination with LLMs. How many parameters? Is it open? Is it closed? Who is it actually open to?
As opposed to, let’s say, enterprise value where let’s just call a spade a spade here, I mean, we’re 14 years in to the public cloud and still 75 to 90% of the data is still on-prem inside of the data center. And sure, enterprises can immediately take advantage of generative AI through SaaS models. We’ve seen ERP vendors, we’ve seen CRM vendors who are GA with the first generation of this, but still that doesn’t include all that data that I’m talking about that’s on-prem.
And I’m curious, Sandra, how will enterprises activate generative AI to take advantage of all this data that’s either sitting on their own edge, maybe in a manufacturing site, fast food restaurant, and also inside of their own data center. How do you see that playing out?
Sandra Rivera: Well, to your point, there is the data gravity that exists in the enterprise and the data wants to be processed at that point of data creation and data consumption.
Patrick Moorhead: Exactly.
Sandra Rivera: And that is typically on-prem or clearly a hybrid model. And so what we’re seeing is this very strong affinity to taking a foundational model, clearly training on that model, using some of the goodness of that model, but then contextualizing the data set that you have on-prem, which is a much, much smaller data set.
Patrick Moorhead: So it’s grounding it.
Sandra Rivera: Exactly. And then doing that fine-tuning and clearly the deployment that happens on-prem and on that edge type of platform.
And when you look at the size and scope of those clusters, they can be as small as two to four servers up to 24 to 32 nodes, these x8 nodes. But that’s all within the purview clearly of what we have with our CPU, clearly with our GPU, and certainly with the AI accelerators. And then of course being fortified by our own IPU capability, our FPGA capabilities.
So I think that what we’re going to see is this very big growth in the enterprise use cases that will require trust and security, a lot of the things that Greg talked about in his keynote, as well, securing the models, securing the data, whether that’s running in the secure perimeter in the cloud for some of that foundational training or actually being deployed on-prem. And we just see a huge opportunity, and probably the biggest opportunity for growth really happens on that edge, on-prem type of footprint.
And going back to Greg’s comment, if we agree, and I think we do, that it is the early days of AI, that the workloads are broad-ranging in terms of their complexity, their size, the multimodality, the latency requirements, real-time nature, batch nature, training versus inference, and that it is not one size fits all, then we’re going to have heterogeneous architectures. And our job is to address all that complexity through this homogenizing layer of software, to protect the developers’ investment in their application software by hiding all that complexity, back to Greg’s point. We are close to the metal. We do understand how all those bits work and all those transistors work and how to get the most out of it, but we try to present a very elegant and simple software interface to the developer.
Greg Lavender: Can I add on the security topic? I have this saying, which I sort of revealed in my t-shirt reveal at my keynote, was the sort of security for AI because we want to secure our models and our data. Yes, you secure the code, too. Spend a million dollars to train a model, you deploy at the edge, somebody steals it, that’s not a good thing. So trusted confidential computing is really critical for the edge, the client, the data center, the cloud.
And then there’s also AI for security. People don’t realize this, but we have Intel Threat Detection Technology in every client CPU we sell. That’s all using machine learning algorithms to look at instruction streams going into the CPU to ferret out whether that matches some ransomware or malware signature. And we can basically block that and then notify Windows Defender, if you’re running Windows, to take action against that.
So we use AI in our fabs to improve our yields, the quality of our chips or our dies. We use AI across the company. So this isn’t a new thing for Intel. It’s a new thing in the market, in some ways. And I really think, too, this large language model training, to your point, is consuming a lot of CPU resources and GPU resources, power resources. Only a small number of people can afford the capital expenditure to do that, which is why we can afford to build our own Intel Developer Cloud. And as you know, the investor community is pouring billions into this and the revenue’s still got to show up there.
But I think the edge and the client, your experience on a PC, your experience on a phone, those are going to change dramatically. And all that computational power that we have there that’s latent on your phone is going to get used by this. And we’re going to continue to drive AI technology, both hardware and software, ubiquitously through our product line. Even our P-core, E-core scheduling algorithm is using AI.
Daniel Newman: Right. That’s great to hear how you’re using it within the company. I think people always want to know that you’re drinking your own champagne.
Greg Lavender: Yep.
Daniel Newman: Greg, Sandra, thank you both so much for joining us here.
Sandra Rivera: Thank you so much.
Greg Lavender: Always great talking to you guys. Appreciate it.
Daniel Newman: Let’s do it again. For you, twice in a week. And Sandra, we love having you here so we’ll have you back soon.
Greg Lavender: Thanks.
Daniel Newman: All right, everybody, hit that subscribe button. Join us for all of these sessions here at Intel Innovation 2023 in beautiful San Jose, California. More from Patrick and I, as always here on the Six Five. But for now, we got to go. See y’all later.