The Next Wave of Gen AI Runs on Cloudera – Six Five On the Road

By Patrick Moorhead - November 21, 2023

On this episode of The Six Five – On The Road, hosts Daniel Newman and Patrick Moorhead welcome Cloudera’s CEO, Charles Sansbury, during Cloudera Evolve NYC for a conversation on how the next wave of generative AI runs on Cloudera, and how the company plans to drive this wave of innovation.

Their discussion covers:

  • The key to Cloudera’s view of how the world is going to evolve, as informed by their customer conversations
  • Hybrid architecture and the importance of mixing on-premises with cloud-based data
  • What kind of limiting factors there are to deploying generative AI solutions with their data and how Cloudera is helping customers overcome that
  • How the company is approaching growth and what the future holds at Cloudera in terms of partnerships, investments, and capabilities on the horizon

Be sure to subscribe to The Six Five Webcast, so you never miss an episode.

Disclaimer: The Six Five webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.


Patrick Moorhead: The Six Five is live, and on the road in New York City for Cloudera Evolve 2023. We’re talking about some of our favorite topics, a lot of AI, a lot of generative AI, and data. Because we all know that you have to get your data estate in order before you can do any of these incredible tricks with analytics, AI and now generative AI. Dan, it’s been a great show.

Daniel Newman: It has been a great show. It’s been good to spend the day here. We’re looking over the beautiful Hudson-

Patrick Moorhead: Yes.

Daniel Newman: Here in New York City. But more importantly, we’ve had the chance to hear from, not only the Cloudera executives, we’ve been hearing from customers, we’ve been hearing from partners. And I think what I can say confidently, Pat, is that I’m going to be able to walk away today with a much clearer understanding of what Cloudera is doing. Having said that, I sure do like the opportunity to ask, and so maybe we save the best conversation for the last conversation, although maybe everyone out there is watching them in different orders. But what do you think?

Patrick Moorhead: So, with that, let’s introduce the CEO of Cloudera. Charles, welcome back to The Six Five.

Charles Sansbury: Thank you, thank you. Who are you guys talking to after me?

Patrick Moorhead: Gosh, we have talked to… It depends on how people want to-

Charles Sansbury: Well, I just meant, if the best is yet to come, I’m just curious who’s next?

Patrick Moorhead: It’s you.

Charles Sansbury: Then-

Patrick Moorhead: You are-

Charles Sansbury: Well, I appreciate that.

Patrick Moorhead: You are that. We have talked to your product leaders.

Charles Sansbury: Yep.

Patrick Moorhead: We have talked to your customer leaders, we have talked to your partners. We talked to AWS, so it has been great, so far, and we thought the best for last, and that’s you.

Charles Sansbury: Well, I appreciate you saying so. I might disagree with you, but nevertheless. I mean, look, this event has been about more than Cloudera, people talking more about customers telling their stories of what they’ve done with the technology.

Daniel Newman: Right.

Charles Sansbury: I still count as the new guy, I’m still relatively new to the company. So what was so kind of empowering or exciting for me is hearing what some amazing companies, brands whose names you recognize-

Daniel Newman: Yes.

Charles Sansbury: Are doing with our products. And you think there are two or three, in other words, a room full. So, all-in, just a great event, and we’re super happy to have pulled this off and pulled it off in a way that I think is a first class event. So, thank you guys for coming.

Patrick Moorhead: Yeah, we appreciate being part of it too. This is our second year in a row, so.

Daniel Newman: Yeah, and I love the humility. The dad joke, the “It’s not me best for last.” It sounds like something I would-

Charles Sansbury: No, it’s definitely true-

Daniel Newman: I would-

Charles Sansbury: I’m not the best.

Daniel Newman: We’ll see. So, by the way, first big keynote, right? Since-

Charles Sansbury: Yes, yes.

Daniel Newman: Other than internal.

Charles Sansbury: Yes.

Daniel Newman: The first big external.

Charles Sansbury: Correct.

Daniel Newman: So always a seminal moment in terms of any CEO’s tenure, but also in a company that’s in such a state of transition. You’ve come in, you’ve made some important changes, you’ve put your vision out there and you actually did it with us, which was really appreciated. But having said that, you kind of made a comment of “the next wave of generative AI.”

Charles Sansbury: Yes.

Daniel Newman: “Runs on Cloudera.” Now, there’s no shortage of companies right now talking about how they’re going to be the driver of generative AI, so let’s put you on the spot here, Charles, and say-

Charles Sansbury: Yeah.

Daniel Newman: What do you mean and how is Cloudera going to drive this wave?

Charles Sansbury: Well, so key to our view of how the world’s going to evolve, and this is informed by customer conversations, not just us. We believe that the world’s going to evolve to a hybrid data infrastructure that powers AI applications. What does that mean? AI grew up in the cloud, right? You had these large language models that fed on a whole bunch of mostly publicly available data and they do amazing things. I’m a big fan of ChatGPT, and I said during the presentation, I actually used it to create a first draft of the letter I sent to the company when I joined Cloudera. And it got some things wrong, in terms of where our headquarters was-

Daniel Newman: Okay.

Charles Sansbury: And I fixed that. But the point that we’re trying to make is the next generation is going to incorporate both that cloud-based data, but really more tuning models with your own data because you get better data, better business outcomes, better results from the models. And we believe that we’re the only company that’s thinking about the magnitude and complexity of the challenge, you’re bringing together across on-prem, private cloud and public cloud, the ability to manage workloads and pull data together from all of those environments and bring it into one open data lake. And again, you heard a lot about Iceberg today, powered in many cases by Iceberg.

Daniel Newman: Right.

Charles Sansbury: That’s a unique capability. And our belief is that those higher level value added applications, not just the nifty parlor trick of what ChatGPT does when you or I log in, that’s the next generation. And so embedded in that comment is the importance of hybrid, the importance of cloud-based technologies which are necessary but not sufficient. And then our commitment to working with open standards and open source products as they become available, so we make the best technology available to our customers.

Patrick Moorhead: So you’re in a great position, I think, and I thought you were in a great position with ML, but when it comes to generative AI, there are certain things that do change. I mean, you have 25 exabytes of data under management that you talked about today and that is absolutely mind blowing. Most people talk about petabytes-

Charles Sansbury: Right.

Patrick Moorhead: And you’re talking about exabytes. And the other thing is that this is mostly, today, on-premises in the data center, type of information, maybe some is on the edge. Yet sometimes we think about, “Hey, the only place you can do any of this is in the public cloud.” You had customers who were on stage who talked about using a hybrid data architecture and morphing that into a hybrid AI implementation. Can you talk about some examples of that and this next generation-

Charles Sansbury: Yeah.

Patrick Moorhead: Generative AI?

Charles Sansbury: The first part that I want to address is just the idea of the importance of mixing on-premises with cloud-based data. We’re seeing customers do that for a number of reasons.

Daniel Newman: Right.

Charles Sansbury: Some of them are security, manageability, also cost. One of the coolest examples I’ve seen, is a pharmaceutical development firm and they have embedded in their on-premises data infrastructure, 30 years of clinical trials data, spread across a dozen geographies, lots of languages. They have some rows and table format data, some unstructured data, they’ve got research papers written in multiple languages.

Daniel Newman: Right.

Charles Sansbury: They’ve basically loaded that into a data lake. They’ve indexed it such that now they can tie a specific organic compound to an impact on a specific gene. So they know that this compound does this to this gene, they know these genes are operative in some orphan diseases. Diseases that they will never fund a research project for, because there’s no money in doing that.

Daniel Newman: Right.

Charles Sansbury: And what they’ve been able to do with this structure is to lay that across their data and surface up hundreds of thousands of potential compounds that could then have an impact on a specific gene, which for an orphan disease, which is driven by this gene-

Daniel Newman: Right.

Charles Sansbury: Ultimately could have an impact. And they’ve narrowed that down to the point, they’ve entered, it’s not a clinical trial, it’s an approved test of existing drugs that have been tested on humans that work on a certain gene that they’re using to try to address some of these orphan diseases. To be super dramatic, in that case, the technology is potentially saving lives.

Daniel Newman: Right.

Charles Sansbury: But moreover, I think it’s a super cool use case of data that existed, finding new ways to do it in ways that are highly valued and you couldn’t do it without one, the hybrid architecture that we enable, and two, the data engineering that’s required.

Daniel Newman: Right.

Charles Sansbury: The rough and tough work of building that foundation to allow them to do that. It’s one of my favorite use cases.

Patrick Moorhead: Yeah-

Daniel Newman: It’s really-

Patrick Moorhead: And it sounds like it’s something that only can be done with generative AI, because I think of machine learning and the capabilities, it’s the multivariate and the model complexity that generative AI brings to the table here.

Charles Sansbury: Exactly. That’s-

Patrick Moorhead: Okay.

Charles Sansbury: That’s what I’ve been told.

Daniel Newman: It’s really interesting because the hybrid architecture sometimes gets underplayed, but the data estate of most enterprises has, you talk about this a lot, probably even more than I do, but the vast majority of the data is on-prem and it’s being created every day, every minute at the edge, of course in the cloud, on-prem data center. And what Cloudera is really trying to do is solve that problem, and I’ve been very positive on how you’ve been doing this with adding the vector and the unstructured. Because I think a lot of people knew that if you just had rows and columns and tables, databases have been really good at that for a long time, but this unstructured is what’s being created, video, this.

Charles Sansbury: Right.

Daniel Newman: And someday you could have this conversation, you have your important meetings, you want to be able to index that, you want to be able to use that and be able to generate with that. That’s very hard and when a lot of that data is not sitting in there in the cloud, that becomes really complicated. So you kind of started talking about the question I was going to ask you. So there are hurdles, there’s a lot of reasons that this is hard. Talk about what kind of the limiting, what are the gating factors to getting this done and how are you helping the companies overcome that?

Charles Sansbury: Well, I think we’re very early in the innings of people using data they have in their on-prem or cloud-based systems and moving that toward generative AI. I think right now we have lots of fun things to look at, but not that many tangible use cases where people can say, “I’ve got business value.”

Daniel Newman: Right.

Charles Sansbury: And the problem is, it’s expensive. So it’s expensive to feed data into a large language model for a long period of time, whether you’re using a public cloud or your own on-premises-

Daniel Newman: That’s right.

Charles Sansbury: Infrastructure. And so I think one of the several limiting factors right now is cost. The second thing is, while boards are saying, “We need to use AI to run our business better, what are you doing?” From a management team’s perspective, CEOs are saying, “Well, we have some cool ideas, but we’re not sure which are the early quick wins that we can go get.”

Daniel Newman: Right.

Charles Sansbury: Which, from a management team’s perspective, will validate further investment. So right now we’re in this formative stage where I think people are excited, but we don’t have as much kind of tangible progress, which is why I think this event was so cool, because we had a bunch of customers who’ve been talking about things-

Daniel Newman: Yes.

Charles Sansbury: They’ve already done. I would argue there was as much real business value communicated by our customers today as any event that’s happened in the industry that’s trying to talk about the progress of AI and so that’s why… I wasn’t a part of this when it happened, but you can’t help but be proud of the work that our customers and our team have done to get to this point.

Patrick Moorhead: Yeah, I’ve really enjoyed the event. If nothing else, getting more specifics on how you’re enabling your customers to do all these great things with AI and generative AI and bringing in a bunch of new partners to help you do that. Names that I know you’d worked with in the past, but it seemed to me much bigger agreements with them, right?

Charles Sansbury: Yes.

Patrick Moorhead: The Pinecones, the AWSs, and the NVIDIAs, really becoming more of a fabric part of the solution. So that was today, and I know the work always continues, but I wondered if you could give our viewers, our listeners an idea of what to expect in the future for Cloudera? And I don’t want to limit you or bracket you, but whether it’s investments, whether it’s capabilities-

Charles Sansbury: Yeah.

Daniel Newman: It’s not-

Patrick Moorhead: That you’re thinking about right now.

Daniel Newman: You’re not public anymore, so you can give us all-

Charles Sansbury: I can say whatever I want. Exactly. There’s no shareholder concern and half our shareholders were in the room today, so I’m fine.

Daniel Newman: Yes.

Charles Sansbury: No, so it’s a really good question and maybe taking a big step back. I was asked by the investors and brought in to help the company realize its opportunity to accelerate growth.

Daniel Newman: Right.

Charles Sansbury: And I think you saw today, we have a great collection of building blocks, right?

Daniel Newman: Yes.

Charles Sansbury: Technology, customer use cases, partners. Our challenge is optimizing or realizing that opportunity. The things that we have done since we joined and have gone through kind of an investment cycle, we’ve accelerated a bunch of investment, primarily around technology and go-to-market. I do think that one of the reasons why you saw the level of customer, first of all the scale of customers you saw on the level-

Daniel Newman: Right.

Charles Sansbury: Was because people have figured out, “My goodness, the company has a hundred times more data under management than other companies in this space. And if that data under management is the precursor to building generative AI, then we need to work more closely with Cloudera.”

Charles Sansbury: I’ve actually had that conversation with a number of companies you saw on the stage today.

Daniel Newman: Right.

Charles Sansbury: We’re also making the investments around accelerating some of our work around Iceberg, accelerating some work around our ability to work with the large, both open source and private models.

Patrick Moorhead: Right.

Charles Sansbury: And all those things are expensive and time consuming. But I think what’s great is, we have a group of investors that are willing to fund those investments. They’re going to pay off not in a week or a month or a quarter, but in one year or two years. So I think we’ve-

Patrick Moorhead: Right.

Charles Sansbury: Been able to really accelerate a bunch of investment around our products and around our go-to-market. And the other thing we’ve done is, we’ve started to increase our investments in branding.

Patrick Moorhead: Right.

Charles Sansbury: So you’ll start to see from us, primarily web-based branding and marketing, which is something we haven’t done in years.

Daniel Newman: Right.

Charles Sansbury: And I think that the company, the market opportunity situation, where it merits those investments-

Patrick Moorhead: Right.

Charles Sansbury: And so we’re making them.

Patrick Moorhead: Excellent.

Daniel Newman: That’s good. And we, obviously, have been following this for a long time, Charles, and this is a big moment and I think getting it right here could create real tailwinds for the company. I think these companies are starting to realize the amount of complexity and having organizations like NVIDIA and AWS stepping up and kind of walking in holding hands, saying side by side, “We believe in what Cloudera is doing.” And they can help you to solve your biggest data complexities in order to drive generative AI capabilities into your organization. Those were very positive outcomes from today’s Evolve event.

Charles Sansbury: Yeah, we feel the same way. Thank you for saying that. I think the challenge is, one of the things that we do well is help customers solve their most complex problems. It’s not always the best thing to accelerate your business if you start talking about-

Patrick Moorhead: Right.

Charles Sansbury: How complicated what you do is. And so we’re trying to find the middle ground of, there’s hard work to do here, we’re spending money and engineering effort to make it easier. But to your point, data engineering at its core is some difficult blocking and tackling. You have to do that work to optimize your success in AI. And I think now people are starting to realize that that’s the work they have to do to do a good job in the long term, or at least we’re hopeful that’s the trend we’re starting to see.

Daniel Newman: Well, it seems like you’re on the right path. And Charles, I want to thank you so much for joining Patrick and me here on The Six Five.

Charles Sansbury: Thank you guys for taking the time.

Patrick Moorhead: Thanks.

Charles Sansbury: We appreciate it.

Daniel Newman: Let’s do it again soon.

Charles Sansbury: Absolutely.

Daniel Newman: All right, everybody, you heard it here. Charles Sansbury, CEO, Cloudera, joining us here on The Six Five On the Road at their Evolve 2023 event in New York City. Hit that subscribe button, check out all the content that we created here at the event. And of course, all the interviews on The Six Five. For this episode, for Patrick Moorhead and myself signing out.

Patrick Moorhead: See y’all later.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.