Last week AWS came out with a slew of product announcements that should make big waves in the world of AI foundation models (FMs) and generative AI, especially in the enterprise space. Amazon has been working with AI and machine learning (ML) in a serious way for two decades, and I believe this move is far more than a response to recent announcements from competitors such as Microsoft Azure and Google Cloud.
Indeed, I think that what Amazon has just announced will be huge for foundation models and other areas of AI. Better yet, Amazon is now providing a more holistic enterprise AI offering than anything I've seen from any other provider.
Time to get serious about AI for B2B
For starters, let’s set the context by emphasizing that the new AWS announcements are distinctly geared for B2B. I’ve read a few things lately that have tried to discuss the AI market as one homogenous blob; the laziest of these have made it seem like the “AI war” will play out between Microsoft’s AI-powered Bing and Google’s AI-powered Bard. Yes, there are fascinating developments on the consumer side, and my colleagues and I have recently analyzed the battle of the search engines, Microsoft’s AI Copilot and more.
But the B2B side is very different, not to mention more complex, than the consumer side—something that’s reflected in the multiple layers of the new AWS offerings. Those offerings are discussed in detail in this blog post by Swami Sivasubramanian, vice president for database, analytics and ML at AWS.
By the way, kudos to Sivasubramanian and his team for including so much helpful background information to start that piece. It’s a lot to read, but it digs into Amazon’s history of using AI across multiple areas of its own business, the breakneck pace of AI innovation and scale happening right now and Amazon’s philosophy for meeting customer challenges using AI. Honestly, the preamble of that blog post is probably the best concise expression I’ve read of what I think is going on in the industry. The announcements themselves break down into four main offerings:
- Amazon Bedrock — a service, currently in limited preview, that uses APIs to deliver best-of-breed FMs from AI21 Labs, Anthropic, Stability AI (makers of Stable Diffusion) and Amazon itself; it helps customers “build and scale generative AI-based applications . . . without having to manage any infrastructure.”
- Amazon Titan FMs — two new large language models (LLMs), one for text and one for vector embeddings (used in search etc.), that are pre-trained using large datasets to serve as general-purpose models, either as-is or after customization with customer data; these are also in preview, with broader availability “in the coming months.”
- ML infrastructure — now in GA, the company is offering Amazon EC2 Trn1n and Amazon EC2 Inf2 instances that are powered, respectively, by AWS Trainium and AWS Inferentia chips; it touts these as enabling “the lowest cost for training models and running inference in the cloud.”
- Amazon CodeWhisperer — also in GA, this is an AI-driven “coding companion” offered free of charge to individual developers; it handles routine coding tasks across a jaw-dropping array of languages and IDEs to improve developer productivity.
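To make Bedrock's "without having to manage any infrastructure" promise concrete, here is a minimal sketch of what calling a hosted Titan text model through a Bedrock-style API could look like in Python. Bedrock was in limited preview at the time of the announcement, so treat the model ID, the request/response field names and the boto3 `bedrock-runtime` client as assumptions rather than documented API details:

```python
import json

# Assumed Titan text model identifier -- not confirmed in the announcement.
MODEL_ID = "amazon.titan-text-express-v1"


def build_request(prompt: str, max_tokens: int = 256) -> str:
    """Build a JSON request body for a Titan-style text generation call.

    The field names here (inputText, textGenerationConfig) are assumptions
    about the request schema; the point is the overall shape of the workflow.
    """
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": 0.5,
        },
    })


def generate(prompt: str) -> str:
    """Invoke the hosted model via the Bedrock runtime client.

    Requires AWS credentials and Bedrock access; no servers or ML
    infrastructure on the caller's side -- which is the whole pitch.
    """
    import boto3  # only needed for the actual remote call

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_request(prompt),
        contentType="application/json",
        accept="application/json",
    )
    result = json.loads(response["body"].read())
    return result["results"][0]["outputText"]


if __name__ == "__main__":
    # Print the request payload; the remote call is omitted so the sketch
    # runs without AWS credentials.
    print(build_request("Summarize our Q3 sales results in two sentences."))
```

The notable design point is that model choice is just a string: swapping in a third-party FM from AI21 Labs, Anthropic or Stability AI would, in principle, mean changing the model ID and request schema while the surrounding application code stays the same.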
Saha told me that “The thing that really differentiates Bedrock, in my mind, is the choice and flexibility that we give to customers, along with the privacy and security guarantees from AWS.” He also made the good point that these services integrate with everything else on AWS, and come with the “workspace mechanisms that [AWS] customers are used to.” He's right on the facts: Amazon offers more models, and more infrastructure choices to accelerate these workloads, than its rivals.
I think this is a definite strategic advantage when you consider that AWS is by some distance the largest cloud IaaS provider in the game. Amazon is making it very easy for existing enterprises that use AWS and need more AI horsepower to simply light up these new products.
Beyond that, Saha emphasized how this whole set of offerings “will truly democratize access to generative AI,” a theme that was echoed throughout the announcement blog post as well. Bedrock aims to allow customers to employ different AI models for a wide variety of uses without needing any infrastructure. The Titan models provide customers with powerful options right out of the box, while also allowing more tailored approaches that don’t require a lot of customization effort. Together, these should live up to the company’s intention—stated in the blog post—to give customers “a straightforward way to find and access high-performing FMs”; seamless integration into applications without incurring big costs; and an easy way “to take the base FM and build differentiated apps using their own data.”
Using the FMs could be even more economical if customers run them off of the new infrastructure offerings, which AWS says deliver price-for-performance that’s 40% or more better than any other EC2 instance—and “the lowest cost for inference in the cloud.” It’s been five years since Amazon announced its Inferentia chips, which reflects how long-term the company’s thinking has been for AI. From an economic standpoint, I was impressed to learn that in its five-year lifespan to date, Inferentia “has saved companies [including] Amazon over $100 million in capital expense.” Many companies make price-performance claims, but in my experience, Amazon has been the most consistent over time. When the company claims an X% gain, two years later it really is X% better. That is hard to do, but as Amazon owns pricing on all its offerings, it can move the price and performance dials up and down as it pleases.
It’s clear that AWS has a big focus on reducing the cost of development. That’s even before we get to CodeWhisperer, which Saha says “really takes the application of generative AI to every software developer.” I expect that hordes of developers will jump at the chance to take care of routine coding tasks using this tool, which seems almost like something from science fiction. And I have no idea how AWS did this, but the thing handles almost every language that I'm aware of and every modern IDE that's out there.
On top of all that, AWS says that participants in a productivity challenge during the CodeWhisperer preview phase “completed tasks 57% faster, on average, and were 27% more likely to complete them successfully than those who didn’t use CodeWhisperer.” By the way, there’s also a business version of the tool that comes with single sign-on, IAM integration and other enterprise-grade features.
Building better AI across multiple vectors
AWS claims that it provides “the most performant, scalable infrastructure for cost-effective ML training and inference.” I asked Saha to comment on what the best proof of this might be, and in particular whether he focused on measures of scale such as parameter count. In my experience, most claims like this are “throwaway”—not based in fact, or true only for a short period of time.
From his response, it’s clear that he takes a holistic view. First, he summarized key aspects of the AWS infrastructure, models, networking, chips and different types of software—with examples of results from customers including Bloomberg and LG. Then he said, “I think it's really a combination of the hardware and the software and the optimizations we do and the customizations we do that make our infrastructure really well suited for LLMs and foundation models. . . . It’s so important for us to be able to provide customers the innovations on all of these vectors.” I will be doing more digging to compare raw performance versus Azure, Google Cloud and Oracle Cloud. But as I said before, I have yet to catch AWS in a claim that wasn’t true on the compute side.
Later in our conversation, he echoed this point in a broader way when he talked about the company’s focus on customers: “When you think about machine learning or generative AI, [it] doesn't happen in isolation, it requires that foundation of compute and storage and security and privacy and governance and all of that. . . . Ultimately people want to do generative AI, but they want it inside an enterprise environment that caters to enterprise needs.”
He again cited Amazon’s “very rich, deep history of machine learning,” adding that “I think that deep heritage gives us a deep expertise.” Combined with how closely Amazon works with customers—and the sheer size of its customer base—Saha believes this legacy “gives us a unique insight into what are the real customer pain points.” AWS does, in fact, host more IaaS ML workloads than any other company and has been doing AI for a long time.
Making “customer-centric” into a reality
Hearing this from Saha reminded me that years ago, when I started hearing leaders at AWS talk about being “customer-centric,” I was dubious. Every company says that they focus on customers. But after following AWS closely for almost a decade, I now believe that its people really walk the talk.
The tricky part is getting to know your customers so well that you’re not only giving them what they want now, but what they’re going to want a year or two down the road, which is usually how long it takes for even the most aggressive product teams to bring something new to market. So it’s impressive to me that AWS has found a way to bridge that gap and step into the middle of the “AI wars” with such a sophisticated and well-considered product set. Net-net, the best companies meld the “art of the possible” with “future customer needs.”
Bedrock, the Titan models, the new infrastructure pieces and CodeWhisperer should all build seamlessly on top of SageMaker, Amazon’s existing platform for building and training ML models. But to hear Saha tell it, the company is just getting started. “This is truly, truly day one for generative AI,” he said. “There's a lot of innovation [still] to happen.”
I’ve said all along that AI is a marathon, not a sprint. Based on the strong showing of last week’s announcements, I believe that Amazon is geared to compete and win in that marathon.