The Six Five On the Road at AWS re:Invent 2023 with Harshal Pimpalkhute

By Patrick Moorhead - December 11, 2023

On this episode of The Six Five – On The Road, hosts Daniel Newman and Patrick Moorhead welcome Harshal Pimpalkhute, Senior Product Manager, Generative AI, at AWS for a conversation on the latest announcements from Amazon Bedrock made at AWS re:Invent.

Their discussion covers:

  • A recap of the announcements made on Amazon Bedrock at AWS re:Invent and what’s new with customization
  • An explanation of the new retrieval augmentation capability to connect foundation models to company data sources
  • Agents for Bedrock and how this will most benefit customers
  • The new guardrail capabilities


Disclaimer: The Six Five webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.

Transcript:

Patrick Moorhead: The Six Five is live and on the road at AWS re:Invent 2023 in Las Vegas. We are in the Frequency-

Daniel Newman: Future Frequency.

Patrick Moorhead: Future Frequency container. It is a real shipping container here and we have been having some great conversations. I mean, we’ve been talking about compute and silicon and serverless, I mean, and the big thing, generative AI, right? We can usually only go 15 seconds before it comes into the conversation. It’s been a huge topic here at re:Invent.

Daniel Newman: You should have thrown me for a loop. We were going to talk about the big thing, which is quantum security or something like that. But yeah, I mean, look, I think right now if you’re at a technology event and the number one thing people aren’t walking away with is a very, very concrete idea of your AI strategy as a technology vendor supplier, ISV, SI, whatever you’re doing, you’ve probably missed the major market trends unless you’re trying to really be a niche player and AWS is not even close. I mean, they’re the biggest in pretty much every category in which they’re competing right now when it comes to infrastructure cloud.

Patrick Moorhead: That’s right. And with generative AI, there’s a lot of ways for customers to get access, that there’s the infrastructure layer. There is even some, I’ll call them PaaS and SaaS capabilities for, let’s say, text and things like images. But at the foundation is Bedrock. And that’s what we’re going to talk about today. Harshal, welcome to the show.

Harshal Pimpalkhute: Thank you.

Patrick Moorhead: And congratulations on the big announcements. You manage Bedrock, so you’re the guy here at the show.

Harshal Pimpalkhute: Yes, thank you for having me. Great to be here. Harshal Pimpalkhute. I am part of the Amazon Bedrock product team. We have a whole set of talented individuals across engineering and science building and delivering the product. Good to be here.

Daniel Newman: I think they could call them Bedrock models instead of foundation models because actually saying like a Bedrock foundation model, it’s redundant, right? Because foundation and Bedrock are, there had to be something there. I’m just throwing that out there.

Patrick Moorhead: Dan, you’re just –

Daniel Newman: I’m always here to tear apart or create something that is architecturally sound and it works.

Patrick Moorhead: And it works.

Daniel Newman: Yeah, so listen, I mean it was a big week on the Bedrock front. I think the company made a huge stride. I think some of the perception of being late to market is quickly in the rearview, I think, when it comes to enterprise. And of course, with bring-your-data-to-the-workload, AWS had this massive inherent advantage anyway because you have by far the most workloads. Give us a little bit of the rundown of this week’s announcements and then some of the customizations that are now available in Bedrock.

Harshal Pimpalkhute: Absolutely. So a big week for Bedrock and generative AI overall. We had quite a few things that we were working on over the last few weeks, few months I want to say, that went live. And if you think about Bedrock, you made a comment earlier about foundation models on Bedrock, a play on that, we really want to provide multiple foundation models on our service. Choice and flexibility is really the key. We have seen from customers, working with customers, that there isn’t really a one size fits all over here, not a single model that can solve all use cases. In fact, in some use cases, you may need multiple models. So, what you’ll see as you go through the announcements last week, or sorry, this week, is there is a choice that we want to provide. There’s customization that you referred to and then the integration of these models into your application. And we want to have this in a really easy to consume manner through a single API. Bedrock really is a service that makes it really easy to build and scale applications. So, on the customization specifically, we have a couple of things going on. We made fine-tuning generally available. It’s also available on a couple of additional models beyond Amazon Titan. So it’s available on Cohere Command and on Meta Llama, so that’s one. And second, we made continued pre-training available in preview. So that’s what’s going on on the customization of models side.
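[Editor’s note: the fine-tuning flow Harshal describes can be sketched with boto3, the AWS SDK for Python. This is only an illustration of the request shape; the bucket, role ARN, model ID, and hyperparameters below are placeholders, not real resources, and the live API reference should be checked before use.]

```python
import json

def build_customization_job(job_name: str, base_model_id: str,
                            training_s3_uri: str, output_s3_uri: str,
                            role_arn: str) -> dict:
    """Assemble a request body in the shape of bedrock.create_model_customization_job."""
    return {
        "jobName": job_name,
        "customModelName": f"{job_name}-model",
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        # Continued pre-training would use "CONTINUED_PRE_TRAINING" instead.
        "customizationType": "FINE_TUNING",
        "trainingDataConfig": {"s3Uri": training_s3_uri},
        "outputDataConfig": {"s3Uri": output_s3_uri},
        "hyperParameters": {"epochCount": "2", "learningRate": "0.00001"},
    }

# All identifiers below are illustrative placeholders.
request = build_customization_job(
    job_name="claims-tuning-demo",
    base_model_id="cohere.command-text-v14",
    training_s3_uri="s3://example-bucket/train.jsonl",
    output_s3_uri="s3://example-bucket/output/",
    role_arn="arn:aws:iam::123456789012:role/BedrockTuningRole",
)
print(json.dumps(request, indent=2))
# In a live account you would pass this to:
#   boto3.client("bedrock").create_model_customization_job(**request)
```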

Patrick Moorhead: Yeah, it’s a big deal. And you have different ways that customers can solve problems with AI. You have the infrastructure and then for ML you have SageMaker and Bedrock is for Gen AI. But just for the sake of education, Bedrock is a managed service, right?

Harshal Pimpalkhute: Yes.

Patrick Moorhead: And one of your goals that I feel like you’ve had through the entire history of the company, starting at Amazon.com, is also making it simple, making something … And by the way, none of this is simple, but making it as simple as possible where we are today for folks. And one of the big questions that I get is, okay, wait a second, how do I get a better outcome through my own custom data? And we’ve seen RAG, retrieval-augmented generation, come in with Bedrock. How do I use RAG to connect my data to your FMs and open FMs that are supported by Bedrock?

Harshal Pimpalkhute: Yes. A couple of different ways. The theme of choice permeates across Bedrock. So retrieval-augmented generation, or RAG, is emerging to be a very popular way that customers want to tap into their company data sources, their knowledge bases, so that they can augment the responses with meaningful information and make a response that is customized for their company policy or company theme. So, a couple of different ways that you can use RAG on Amazon Bedrock. One is you now have an API, a retrieve-and-generate API, that takes a search query as input along with an embeddings model and a foundation model, looks up the information, and surfaces a response to you. So that’s one way to do things. And the other way is you can create an agent that includes a knowledge base and surfaces information. So those are the two different ways you could use RAG.
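[Editor’s note: the single-call RAG path Harshal mentions follows the shape of the Bedrock RetrieveAndGenerate API in the bedrock-agent-runtime client. The sketch below only constructs the request; the knowledge base ID and model ARN are illustrative placeholders.]

```python
def build_rag_request(query: str, knowledge_base_id: str, model_arn: str) -> dict:
    """Build a request in the shape of bedrock-agent-runtime.retrieve_and_generate:
    a text query plus the knowledge base and foundation model to answer with."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }

# Placeholder identifiers for illustration only.
request = build_rag_request(
    query="What is our parental leave policy?",
    knowledge_base_id="KB12345",
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
)
# In a live account:
#   boto3.client("bedrock-agent-runtime").retrieve_and_generate(**request)
```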

Patrick Moorhead: Interesting.

Daniel Newman: So, let’s talk about the agent part because I think people are hearing a lot about RAG, but creating the agent, being able to source that information in real time. I mean, look, this is all about the ability to differentiate with the data you have. Because basically what we found out very quickly in this LLM race and everything is that the openly available models are increasingly democratized. They’re easy to access and everybody can access them, which is what Bedrock’s actually done incredibly well. But companies have this unique proprietary dataset and they’re trying to get access to it. And you just talked a little bit about RAG, you talked about the agent side of it, but these are the methodologies of how companies can get at that high-value data to offer solutions. So give us a little bit of that background on the agent strategy.

Harshal Pimpalkhute: Sure. So agents, think of agents as microservice code that you can use to extend foundation models and actually invoke APIs and look up information using data sources.

Patrick Moorhead: Are agents real time?

Harshal Pimpalkhute: So agents, you can interact with agents in real time. And there are a couple of use cases for real-time interaction, for example chatbots or virtual assistants, but agents are also really helpful as digital assistants, so you can automate your tasks. And the way the agent does this is by breaking the task into multiple sub-tasks or multiple steps and coming up with an orchestration plan to figure out the steps, figure out the sequence, and then actually invoke the right APIs behind the scenes. So that’s really the mechanics under the hood, what’s happening. From a developer point of view, you just provide a simple natural language instruction, tell it which APIs it has available and how it can access those APIs, and maybe a couple of knowledge bases to look up information. And that’s pretty much it. Behind the scenes, we will do the prompt engineering and come up with the entire orchestration plan and have it ready for end users to interact with. So yeah, that’s how agents really work, it’s how we are designing them.
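[Editor’s note: the plan-then-invoke loop Harshal describes can be illustrated with a toy local sketch. A real Bedrock agent derives the plan with an LLM and calls real APIs; here the plan, tools, and data are hard-coded stand-ins just to show the orchestration idea.]

```python
from typing import Callable

def lookup_pending_claims(since: str) -> list[str]:
    """Stand-in for a data-source API an agent might be given."""
    return ["claim-101", "claim-202"]

def send_reminder(claim_id: str) -> str:
    """Stand-in for an action API an agent might be given."""
    return f"reminder sent for {claim_id}"

# The "APIs it has available," registered by name.
TOOLS: dict[str, Callable] = {
    "lookup_pending_claims": lookup_pending_claims,
    "send_reminder": send_reminder,
}

def run_plan(plan: list[tuple[str, dict]]) -> list:
    """Execute each (tool, args) step of an orchestration plan in sequence."""
    results = []
    for tool_name, args in plan:
        results.append(TOOLS[tool_name](**args))
    return results

# A plan an agent might derive from "send reminders for claims pending since June 1":
plan = [("lookup_pending_claims", {"since": "2023-06-01"})]
claims = run_plan(plan)[0]
confirmations = [TOOLS["send_reminder"](c) for c in claims]
print(confirmations)  # ['reminder sent for claim-101', 'reminder sent for claim-202']
```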

Patrick Moorhead: So what’s the big benefit? Is it speed? Is it simplification? Is it getting more done? It’s an efficiency play? What is the end result for your customer, for Bedrock?

Harshal Pimpalkhute: Yeah, so let’s talk about it from two angles. What does it mean for the developer? And then let’s talk about what does it mean for the end user? For the developer, there are really three things that we provide. There’s control, there is visibility, and then there is a secure platform to do the orchestration. So, on the control side, you have the engineered prompt that you can now use to further refine and further enhance the end user experience, so that’s one. That’s the first one for the developer. The second piece is the visibility. As you might be aware, customers want to know, if you’ve come up with a specific answer or you executed a certain task, why did you do it? That’s an integral part of building an application. That’s an integral part of debugging. So we provide developers with the visibility to step through these tasks, why the agent called what it did. And the third piece is really the security. Security is job zero at AWS, and with agents and with Bedrock, your data is always encrypted in transit and at rest, and we provide a very secure way to do these operations. So that’s from the developer perspective. From an end user perspective, you can automate tasks. For example, processing insurance claims, or retail order management. Those are the kind of tasks that you will be able to automate with a very simple development environment, which is-

Patrick Moorhead: And these are actual end users, let’s say frontline workers maybe-

Harshal Pimpalkhute: Yes.

Patrick Moorhead: … in a retail store, or somebody working administration behind a desk. And they really don’t have to-

Harshal Pimpalkhute: Absolutely.

Patrick Moorhead: … understand the technology. It has to be set up by-

Harshal Pimpalkhute: Exactly.

Patrick Moorhead: … IT and folks and then away it goes.

Harshal Pimpalkhute: Exactly. So, let’s take an example of insurance claim processing. Now, IT, as you referred to, would set up this agent with the steps that I described earlier. And then the end user, in this case it’s someone working in the agency, like an insurance agent working in the agency, who interacts with this generative AI application and says something like, “Hey, can you tell me what are the pending claims as of the first of this month? And send them a reminder to turn in their paperwork.” So, you can automate these really mundane tasks really quickly.

Patrick Moorhead: CRM. CX.

Daniel Newman: Yeah, and to be really clear, there was about a three or four-year period of time where there was this RPA buzz and it’s been really interesting to sort of watch that space get disrupted really, really quickly. And it’s not … You know, look, I’m not being cynical about it, I just mean those things were done using, it was a brute force kind of approach.

Patrick Moorhead: Analytics and maybe machine learning.

Daniel Newman: But if you kind of think about in an SAP workflow, if you work in procurement, there’s typically seven, eight, nine different screens that need to be, someone goes through. And RPA did a lot of screen scraping and did a lot of things in the hard code level that can make things automated. But in this world, the ability to be dynamic is so much more important. And generative AI has just stepped in and said, “Look, thanks for coming. Here’s your gift bag, we’ll see you later.” Because that’s what you’re really doing is you’re giving people the ability to move to these intelligent processes very, very quickly using generative capabilities. I want to pivot with the little bit of time we have left. Harshal, thank you so much for spending some time with us here. And I want to talk about something that was really hot in the LLM world early on, and that was the reliability, the safety, the guardrails. Can we trust? Can we trust the process? Can we trust the information? Can we trust what’s being generated? And this is probably the biggest hiccup. We’ve seen indemnification clauses coming down left and right. We’ve seen companies, we’re starting to see testing going on about fidelity, the fidelity of these things and comparing models. And Bedrock’s obviously open model, so it’s not about … People can look at those things and deploy whichever one they want, but the company was very steadfast about talking about guardrails. Can you talk a little bit about some of the guardrail announcements here and how the company’s approaching making sure it’s safe and trustworthy?

Harshal Pimpalkhute: Absolutely. And thank you for bringing that up. As you pointed out, safety and making these available in a responsible way is top of mind for a lot of our customers. In fact, all of our customers. And we’ve been focused on it. And as you may have seen in Adam’s keynote, we announced Guardrails on Amazon Bedrock. And Guardrails, just to double click on it, is the ability in the hands of the developer to steer the interaction with the foundation model based on the company policies, and apply filters or content filters so that it’s tailored to their specific use case. So, a couple of things really quick that we launched this week. One is a denied-topic guardrail. So think of, let’s flesh out the example, the same insurance agent example. When people are interacting with a virtual assistant, you want them to be focused on the task and not deviate. You don’t want this assistant giving out financial advice, for example. So you can create a guardrail, let’s call it a financial advice guardrail, which tells it, “Do not talk about stocks, do not talk about 401k, do not talk about portfolio balancing.” And similarly, you can provide these as utterances, as we call them. And using this set of utterances, Bedrock creates a guardrail, a denied-topic guardrail, for you, and you can apply this guardrail as part of your inference. So, going back to the agent example, if you apply this guardrail to the agent, and then you ask the application something like, “Hey, which stocks do you think I should invest in?” It is going to come back and say, “I’m sorry, I can’t answer that question. I can help you with any insurance claims processes.” Or something like that. So you can steer the conversation with these guardrails in a certain direction based on the use case, based on the company policies that you would have in place.
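[Editor’s note: the financial-advice example Harshal walks through maps onto a denied-topic definition in the shape of Bedrock’s CreateGuardrail API, with example utterances to block. The sketch below only builds the payload; names, wording, and field values are illustrative, and the current API reference should be checked before use.]

```python
def build_financial_advice_guardrail() -> dict:
    """Build a guardrail definition with one denied topic and sample utterances."""
    return {
        "name": "financial-advice-guardrail",
        "description": "Keep the insurance assistant away from investment advice.",
        "topicPolicyConfig": {
            "topicsConfig": [{
                "name": "FinancialAdvice",
                "definition": "Advice about stocks, 401k plans, or portfolio balancing.",
                "examples": [  # utterances that should be denied
                    "Which stocks should I invest in?",
                    "How should I balance my 401k portfolio?",
                ],
                "type": "DENY",
            }],
        },
        # Canned responses when a denied topic is detected.
        "blockedInputMessaging": "I'm sorry, I can't answer that. I can help with insurance claims.",
        "blockedOutputsMessaging": "I'm sorry, I can't answer that. I can help with insurance claims.",
    }

guardrail = build_financial_advice_guardrail()
# In a live account: boto3.client("bedrock").create_guardrail(**guardrail)
```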

Patrick Moorhead: This is great. So, I’d like to say that I learned everything in the pre-briefings and the keynotes and the slide decks and the press releases, but what I just learned is that Guardrails is customizable.

Harshal Pimpalkhute: Absolutely.

Patrick Moorhead: So if I had a healthcare company versus a sports chat site, there are probably going to be a different set of guardrails. And there might be … And this is great, this is really exciting. Because sometimes there were some generic rules where the company did the filtering and different countries, different use cases, different workloads.

Daniel Newman: Syntax and …

Patrick Moorhead: Exactly. And this is really exciting. And it’s funny, I think it’s the first time I’ve been excited about something like this, but this is exactly what customers need because they want to be the ones to have the bells and the whistles and things like that. Otherwise they’re not going to use this because it’s too big of a liability, right?

Harshal Pimpalkhute: Yeah.

Patrick Moorhead: And doing it before with either analytics, machine learning or deep learning techniques was also very hard. So I’m hopeful you built an architecture for that, because people, it’s just human nature, they’re going to want more bells and whistles as we go in the future. And quite frankly, if I look back at the history of AWS, you started off with one VM size and then it’s like, “Everybody, hey, same networking, same memory.” But everybody wants to be able to customize for their uniqueness. So yeah, I’m excited. I don’t normally get excited about this stuff, Dan.

Daniel Newman: Yeah, I don’t think that’s really the case. You normally do get excited, that’s why … I’m joking. That’s why we’re here, right? It’s why we do this.

Patrick Moorhead: It’s the customization part of it that I keep asking and asking and asking about, and I feel like I’ve seen a lot of people talk about it, but this is the first time, and probably the best articulation, I’ve heard so far.

Daniel Newman: I’ll reiterate this and I’ve said this on other shows. I really came into this event knowing that this was a very critical inflection point for AWS. The world, it was sort of watching and what you’ve really come out is declaratively saying, “Look, the combination of our massive infrastructure users, workloads, data, plus our really open generative AI stack, we’re not late. We didn’t miss anything. We understand what the enterprise needs and we’re here to basically support the build out of the future.” And I think that that was answered, and I think that’s what I as an analyst wanted to hear this week.

Harshal Pimpalkhute: Yeah, absolutely. I couldn’t agree more. At AWS and at Amazon we work backwards from customer requirements and we’ve proved that we can do that again with Bedrock. And if you look at all the announcements, they are for customization, they’re for choice. Making it easy to integrate with your generative AI applications so that workloads are production ready, for enterprise ready use cases. Guardrails is just one example. So we are looking forward to how customers will be using these in that environment.

Daniel Newman: Being able to turn those knobs, make it just right for your business, but at the same time, fast, streamlined, automated, these are things people want. Harshal, I want to thank you so much for joining us here.

Harshal Pimpalkhute: Thank you.

Daniel Newman: I really appreciate your time.

Harshal Pimpalkhute: Thank you for having me. It’s a pleasure.

Daniel Newman: All right, everybody hit that subscribe button. Join us for all the podcasts and conversations with Patrick and I on The Six Five On The Road here at AWS re:Invent in Las Vegas. For this episode, it’s time to say goodbye. We’ll see y’all really soon.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.