AWS Turbocharges Foundation Models With Smart AI Agents

By Paul Smith-Goodson, Patrick Moorhead - August 24, 2023

AWS recently announced an important AI capability called agents that adds key functionality to its foundation models. Before discussing agents in detail, I will provide an overview of the Amazon Bedrock foundation models (FMs) that will use these new features. Patrick Moorhead, founder and Chief Analyst of Moor Insights & Strategy, provided a first look at AWS Bedrock in an earlier Forbes article.

Overview of Bedrock foundation models

Amazon Bedrock exposes foundation models through API calls, making them straightforward to integrate into applications. Bedrock's API service also lets developers build applications without having to manage AI infrastructure.
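As a minimal sketch of that integration pattern, the snippet below builds a request for the Amazon Titan Text model and shows how it would be sent through the Bedrock runtime API via boto3. The prompt text and parameter values are illustrative, not from the article.

```python
import json

# Sketch of a Bedrock invocation, assuming the Amazon Titan Text input
# format (inputText plus textGenerationConfig). Values are illustrative.
def build_titan_request(prompt, max_tokens=512, temperature=0.5):
    """Build the model ID and JSON body for a Titan Text invocation."""
    body = {
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": temperature,
        },
    }
    return "amazon.titan-text-express-v1", json.dumps(body)

model_id, body = build_titan_request(
    "Summarize: Bedrock exposes foundation models via API calls.")

# With AWS credentials configured, the request would be sent like this:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(modelId=model_id, body=body,
#                                contentType="application/json")
# print(json.loads(response["body"].read())["results"][0]["outputText"])
```

Because the model runs behind a managed endpoint, the application code above never touches GPUs, containers or model weights.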

Training foundation models is a time-consuming process. Small models with about 100 million parameters can be trained on just a few GPUs in a few days. On the other hand, most models used in the corporate environment are large language models (LLMs) running on thousands of GPUs that can take months to train.

Using Bedrock, LLMs such as Amazon Titan and Anthropic's Claude can generate text, answer questions and summarize content without the need to build and train them from scratch.

Bedrock foundation models are already optimized and pretrained for conversations and content creation. And because they run on an AWS-managed infrastructure, the models can be scaled on services like EC2 and Lambda and provide low-latency endpoints to enable real-time integration into workflows.

Bedrock models can also be customized through API parameters, with features including caching, access controls and usage monitoring.

Bedrock offers a choice of foundation models



The two Amazon Titan foundation models, Titan Text and Titan Embeddings, are pretrained on large datasets. Both models can be used as-is or customized for a particular task using proprietary data, which avoids the expense and time developers would otherwise spend annotating large volumes of training data.

  • Titan Text is a large language model used for natural language processing (NLP) tasks such as summarization, text generation, classification, QA and information extraction from text.
  • Titan Embeddings works differently from Titan Text. Instead of generating text, Embeddings encodes words, phrases or blocks of text into high-dimensional numeric vectors used for semantic search, recommendations, sentiment analysis and other tasks. The model finds relationships between conceptual meanings rather than between keywords.
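To make the idea of matching on meaning concrete, here is a toy semantic-search sketch. The tiny hand-made vectors stand in for the high-dimensional embeddings a model like Titan Embeddings would return; cosine similarity is the standard way to rank such vectors by semantic closeness.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings for three phrases (real ones have hundreds or
# thousands of dimensions and come from the embedding model's API).
vectors = {
    "refund my order":   [0.9, 0.1, 0.0],
    "return a purchase": [0.8, 0.2, 0.1],
    "weather forecast":  [0.0, 0.1, 0.9],
}

query = vectors["refund my order"]
ranked = sorted(vectors, key=lambda k: cosine_similarity(query, vectors[k]),
                reverse=True)
print(ranked)  # semantically closest phrases first
```

Note that "refund my order" and "return a purchase" share no keywords, yet their vectors point in nearly the same direction, which is exactly the behavior keyword search misses.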

Jurassic-2, created by AI21 Labs, is a multilingual LLM for generating text in Spanish, French, German, Portuguese, Italian and Dutch. The model is available in three sizes: Large, Grande and Jumbo, alongside instruction-tuned versions of the Jumbo and Grande models. Jurassic-2 also offers zero-shot instruction capabilities, which means developers can steer the model with natural language prompts without providing examples.

Claude-2 is an LLM designed for dialogue, content creation, complex reasoning, creativity and coding. Anthropic built it using its Constitutional AI approach to safer training. Claude-2's input can accommodate 100,000 tokens, equivalent to about 75,000 words. That makes it possible to input hundreds of pages of information for analysis.

Stable Diffusion is an open-source text-to-image model created by Stability AI and trained on a dataset of 5 billion image-text pairs. This model can create realistic, high-quality images of different styles and content from a text prompt.

Command and Embed, two of the newest foundation models available on AWS Bedrock, were created by Cohere.

  • Cohere Command is a large natural language text generation model focused on task-oriented dialogue. It supports summarization, copywriting, dialogue extraction and question answering. Its use cases include database queries, forms and website navigation.
  • Cohere Embed generates vector embeddings that represent the semantic meaning of text. It encodes words, sentences and whole documents into high-dimensional numeric vectors. It is useful for semantic search, recommendations and personalization based on meaning.

Agents for Amazon Bedrock


Foundation models are powerful, but on their own LLMs cannot execute tasks against external systems. For that reason, AWS created intelligent agents to manage and perform complex tasks associated with foundation models.

When using agents, developers need only to provide high-level goals or natural language instructions to the foundation model. The agent handles interpreting those instructions, orchestrating the steps involved, integrating the instructions with various systems and providing the right prompts to the model.

Agents allow developers to complete tasks and workflows simply. Natural language instructions direct agents to automate workflows, gather specified information, monitor systems for events or fill out web forms. The agent maps the developer’s natural language instructions to the specific actions and workflows needed to complete the goal.
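That instruction-to-action mapping can be sketched in miniature. In the toy example below, keyword matching stands in for the LLM-driven interpretation a real Bedrock agent performs, and both action functions are hypothetical stand-ins for backend integrations.

```python
# Toy sketch of the agent pattern: match a natural-language instruction
# to a registered action, then execute it. A real Bedrock agent uses a
# foundation model to interpret the instruction and plan the steps.
def lookup_order(order_id):
    # Hypothetical backend call; returns canned data for illustration.
    return {"order_id": order_id, "status": "shipped"}

def send_summary(text):
    # Hypothetical summarization step, truncated for brevity.
    return f"summary sent: {text[:20]}"

ACTIONS = {
    "order status": lambda: lookup_order("A-1001"),
    "summarize":    lambda: send_summary("Quarterly results were strong."),
}

def run_agent(instruction):
    """Run the first registered action whose trigger appears in the instruction."""
    for trigger, action in ACTIONS.items():
        if trigger in instruction.lower():
            return action()
    return "no matching action"

print(run_agent("Check the order status for customer A-1001"))
```

The value of the real service is that the developer only registers the actions and states the goal; the agent's foundation model handles the interpretation, sequencing and prompting that this sketch hard-codes.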

Even though agents can interpret user requests, carry on a conversation and break down complex tasks into simple steps, developers must still provide the right prompt engineering to ensure that the agent has the right prompts and instructions to work with.

Agents leverage AWS’s cloud infrastructure to scale and maintain real-time responsiveness to users. The cloud also connects agents to external data sources such as OpenSearch or other databases to retrieve the latest contextual information. By combining cloud hosting with prompt augmentation from live data, agents can provide low-latency responses to users even during high-volume usage.

Wrapping up

According to Amazon, thousands of customers are now using Amazon Bedrock for various generative AI applications such as self-service, customer care, text creation and post-call analysis. Having a choice of several models is significant because no one model can satisfy every use case within an enterprise. That’s why AWS offers two foundation models from Amazon plus foundation models from best-in-class AI startups.

How does this work in the real world? In one example, a major insurance firm is using Bedrock to test generative AI applications for analyzing market data and evaluating the impact of AI functionality on employee efficiency. In another instance, a bank is working with the AWS Generative AI Innovation Center to pioneer new use cases based on Bedrock foundation models.

Although machine learning has been used in customer care applications for a decade, and chatbots and AI-powered contact centers have been around for years, the recent surge in generative AI has dramatically improved the versatility of these applications.

Generative AI applications such as conversational search, text summarization and other “copilot” actions have already increased employee productivity. In a more technical setting, code generation is one of the most important productivity improvements for software developers.

Generative AI is still in the early phases of enhancing business operations, but as the technology continues to improve, it will enable even more significant optimizations across organizations.

At the same time that AWS announced agents for Bedrock foundation models, it also made other important AI announcements that included AWS Entity Resolution, Amazon EC2 P5 Instances using NVIDIA H100 Tensor Core GPUs, and Generative BI capabilities in Amazon QuickSight. I plan to cover a few of these enhancements in a separate article.

Paul Smith-Goodson is the Moor Insights & Strategy Vice President and Principal Analyst for quantum computing and artificial intelligence. His early interest in quantum computing began while working on a joint AT&T and Bell Labs project; during 360-degree overviews of advanced projects at Murray Hill, Peter Shor provided an overview of his groundbreaking research in quantum error correction.
Patrick founded the firm based on his real-world technology experiences and an understanding of what he wasn't getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of "power" (ARInsights) and "press citations" (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience, including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq (now HP), and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.