RESEARCH NOTE: Salesforce Einstein 1 Accesses Unstructured Business Data

By Robert Kramer, Patrick Moorhead - December 29, 2023

Salesforce is taking a significant step in focusing on unstructured data, as I’ll explore in this article. Before getting into the specifics, it’s good to review the difference between structured and unstructured data. Structured data is the precise and often quantitative data found in tools like spreadsheets and CRMs; it includes information such as names, dates, payment amounts, postal codes, or selections from dropdown lists. On the other hand, unstructured data is more qualitative and includes things like freeform comments in feedback boxes, videos posted to social media, or tweets criticizing a company’s service.

Business decisions often rely on structured data because it is straightforward and easy to analyze, for example when we use revenue figures and location data in trend graphs or sales territory maps. However, deriving insights from unstructured data is more challenging, even though it constitutes the majority of data held by companies. This is what makes Salesforce’s ability to access and analyze unstructured data so significant, as it opens up new opportunities for understanding and decision-making in businesses.

I’ve recently covered Salesforce in an article focusing on its new Einstein 1 Platform. This platform merges internal and external business data for use by Einstein Copilot, Salesforce’s generative AI assistant that is integrated across all its applications. A key component of the Einstein 1 Platform is the Data Cloud, which centralizes customer data, streamlining access and analysis.

What’s New for Salesforce

Salesforce has unveiled a new development, the Data Cloud Vector Database, which combines unstructured data formats with traditional CRM data for use within the Einstein 1 platform. This innovation allows Einstein Copilot Search, which is equipped with AI capabilities, to understand and respond to complex queries using a variety of unstructured data sources. By harnessing unstructured data, Salesforce applications can improve workflows, analytics, and automation processes. Thanks to Einstein Copilot Search, users can take advantage of advanced AI functionalities through precise, conversational answers drawn from the Data Cloud.
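To make the idea of a vector database more concrete, here is a minimal sketch of similarity search over unstructured text that stays linked to CRM records. It is an illustration only, not Salesforce's implementation: the `embed` function is a toy word-count stand-in for a learned embedding model, and the account IDs, sources, and texts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts stand in for a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical unstructured snippets, each tied to a CRM account ID.
documents = [
    {"account_id": "A-100", "source": "support email",
     "text": "The invoice export keeps failing for our finance team"},
    {"account_id": "A-200", "source": "call transcript",
     "text": "Customer asked about upgrading seats before renewal"},
    {"account_id": "A-300", "source": "feedback form",
     "text": "Love the new dashboard but export to PDF is slow"},
]

# The "index" pairs each document with its vector.
index = [(embed(d["text"]), d) for d in documents]


def search(query, top_k=2):
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [doc for _, doc in ranked[:top_k]]


results = search("export failing")
for doc in results:
    print(doc["account_id"], doc["source"])
```

Because each snippet carries an account ID, a match in unstructured text can be joined back to the structured customer record, which is the kind of combination the announcement describes.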

“This advancement in Data Cloud, coupled with the power of LLMs, is a game-changer, fostering a data-driven ecosystem where AI, CRM, automation, Einstein Copilot, and analytics turn data into actionable intelligence and drive innovation,” said Rahul Auradkar, EVP and general manager for Data Cloud and the Einstein Platform at Salesforce. Salesforce’s Einstein Trust Layer also ensures that AI-generated content is reliable while upholding strict standards of data governance and security.

Einstein Copilot Search works within the Retrieval Augmented Generation framework to make AI more trusted and relevant.

The effectiveness of Copilot is largely due to its Retrieval Augmented Generation (RAG) framework. This technology is particularly useful for Salesforce because it allows the platform to gather data from a wide variety of sources while maintaining data quality. The RAG framework works by first retrieving the data most relevant to a query from these sources. It then augments the query with that retrieved content, supplying it to the LLM as context. Finally, the LLM generates an output grounded in this context rather than in its training data alone. This method helps keep outputs accurate and relevant because they draw on up-to-date data. It improves the search function in Copilot, helping Salesforce users find more precise and context-specific results.
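The retrieve, augment, and generate steps can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how Einstein Copilot Search is built: retrieval here is plain word overlap, the passages are hypothetical, and `generate` is a placeholder where a real system would call an LLM.

```python
def retrieve(question, passages, top_k=2):
    """Step 1: rank passages by word overlap with the question (toy retrieval)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        passages,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, context):
    """Step 2: augment the question with the retrieved passages."""
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {c}" for c in context]
    lines += [f"Question: {question}", "Answer:"]
    return "\n".join(lines)


def generate(prompt):
    """Step 3: placeholder (assumption) for a call to an LLM."""
    return f"[model response to a {len(prompt)}-character prompt]"


# Hypothetical unstructured sources.
passages = [
    "Ticket 4812: customer reports login errors after the password reset",
    "Call summary: customer is happy with onboarding and asks about training",
    "Forum post: several users mention login errors on mobile",
]

question = "Which customers mention login errors"
context = retrieve(question, passages)
prompt = build_prompt(question, context)
answer = generate(prompt)
print(prompt)
```

Only the two passages about login errors reach the prompt, so the model answers from relevant, current material instead of everything in the data store.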

Benefits of Using Unstructured Data

The way the Data Cloud Vector Database incorporates unstructured data allows that data to be leveraged to improve various business functions. For example, it can improve customer support by analyzing call recordings to identify common issues, subjects, or customer concerns. Marketing teams can use data from social media posts to better understand customer preferences, enabling Einstein 1 to suggest more relevant content. This data can also help create targeted email campaigns by analyzing customer sentiment and behavior patterns, as well as recommending products or services based on customers’ purchase history and browsing habits.

The system can also enhance product development and innovation by examining customer reviews and feedback. This process involves analyzing unstructured data such as textual feedback from forms, social media posts, emails, support tickets, online forum discussions, and visual content. By doing so, Einstein 1 can identify the strengths and weaknesses of products, helping prioritize product features according to customer needs and feedback.

From a sales perspective, Salesforce’s predictive analytics tools can forecast customer churn by examining patterns in emails, calls, and social media interactions, with dashboards displaying churn risk scores. These tools also predict sales opportunities using customer interaction data, plus they identify risks and opportunities through analysis of industry news and social media trends. Considering this range of examples, it’s clear how having access to unstructured data—with the tools to make the most of it—can improve business performance.


Salesforce’s integration of unstructured data, including social media content, emails, and customer feedback, offers both benefits and challenges. The primary advantage is that it allows for the more effective utilization of large quantities of unstructured data that many organizations have but may not fully exploit. With this expanded data pool, you gain significantly improved customer insights. This deeper dive into customer behavior and preferences enables the creation of more tailored and personalized experiences. In short, incorporating unstructured data can uncover hidden patterns, trends, and customer needs that structured data alone might not reveal, leading to better-informed business decisions and strategies.

However, there are drawbacks. Unstructured data can be inconsistent and messy, raising issues about data quality and relevance. The integration and analysis of such data can add complexity to the Salesforce platform. Privacy concerns are significant, too, especially when using data from public forums, necessitating strict adherence to data protection laws. Handling this kind of data requires increased computational power and storage, resulting in higher expenses. While this primarily impacts Salesforce, it also affects your bandwidth usage, potentially influencing your costs. Lastly, the sheer volume of unstructured data risks information overload, making it challenging to distill meaningful insights. Although the capabilities of Copilot are meant to address this challenge, it’s yet to be seen whether the tool will effectively meet this need, so this is something to monitor.

While Salesforce’s use of unstructured data offers substantial benefits for customer insights, it also introduces complexities in data management, privacy, and resource allocation. The key challenge lies in balancing these benefits with the potential risks and complexities.

The Data Cloud Vector Database and Einstein Copilot Search will be available for pilot testing in February 2024, with Einstein Copilot set to become generally available in the same month.

Robert Kramer
VP & Principal Analyst at Moor Insights & Strategy

Robert Kramer is vice president and principal analyst covering enterprise data, including data management, databases, data lakes, data observability, data analytics, and data protection. Robert has over 30 years of proven experience with startups, IT companies, global marketing, detailed strategies, business modeling, and planning, working with enterprise companies, GTM assets, management, and execution.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.