IBM Expands Watsonx Platform With Watsonx.data

By Robert Kramer, Patrick Moorhead - October 4, 2023

When I began my career as an ERP marketing specialist, I delved deep into the intricacies of the IBM RS/6000, a Unix-driven system introduced in 1990. The RS/6000 was a versatile parallel system, pioneering in its support of both technical computing and commercial applications such as databases, transaction processing and multimedia servers. It was mostly an on-premises server, and it stayed in production until 2017.

Fast forward to this year, when IBM released its innovative watsonx platform, which it hopes will mark a transformative moment in the AI and data world. The cloud-centric AI and data platform with the lowercase name enables businesses to build, roll out and manage AI applications confidently.

The history of Watson at IBM

If you’ve ever wondered about the evolution of IBM’s Watson lineup, here’s a brief overview. IBM unveiled the Watson supercomputer in 2011, merging AI and analytical capabilities. The machine was named after IBM’s pioneering leader, Thomas J. Watson. The question-answering supercomputer was built with state-of-the-art natural language processing and machine learning capabilities. It was initially designed for the Jeopardy! game show, and in 2011 Watson competed against two human champions, Brad Rutter and Ken Jennings, and won the $1 million first prize.

Over the years, Watson’s applications have expanded; today, organizations across many industries harness its power. In healthcare, it assists clinicians with diagnoses and personalized treatments. In finance, it defends against fraud, streamlines risk assessments and improves customer interactions. Businesses in multiple sectors use Watson to transform customer service, automate processes and provide tailored support. Legal professionals turn to Watson for legal research, contract reviews and efficient document drafting. Meanwhile, manufacturers employ Watson to enhance quality control, fine-tune production techniques and boost cost-effectiveness.

The new dawn of Watson

The watsonx platform offers three distinct solutions tailored to a range of client requirements. First, watsonx.ai serves as a hub for AI professionals, supporting every phase of AI development from training to deployment for both established and cutting-edge AI methodologies. Second, watsonx.data is an open lakehouse storage solution optimized for AI projects, ensuring seamless data transfer and management. Finally, watsonx.governance enhances AI governance by promoting trust, minimizing human intervention risks and ensuring clear accountability and transparency in AI processes.

The new evolution of watsonx.data

IBM launched watsonx in May 2023 at the IBM Think conference, and the company has recently announced two significant updates to watsonx.data planned for release in the last quarter of 2023. The first integrates generative AI capabilities from watsonx.ai, enabling users to work with data for AI tasks through an intuitive user interface. The second adds vector database features to watsonx.data to reinforce watsonx.ai’s data retrieval functions. watsonx.data’s open lakehouse design augments analytics and AI for businesses, enables better data access, ensures reliable insights and decreases data warehouse expenses. This service is now available on IBM Cloud, on AWS and as containerized software. I would like to see it made available on other cloud providers such as Google Cloud, Oracle and Microsoft Azure.

Figure: watsonx.data improves on first-generation data lakehouses (Image: IBM)

The image above illustrates the enhancements IBM watsonx.data brings to current first-generation data lakehouses. watsonx.data optimizes costs and performance by pairing the right workload with the right engine. You can run all workloads from a single pane of glass, eliminating the trade-offs between convenience and performance. IBM watsonx.data can be deployed anywhere, including hybrid and multi-cloud environments. Shared metadata across multiple engines eliminates the need to re-catalog, accelerating time to value while ensuring strong governance and eliminating costly implementation efforts.

watsonx.data functions and uses

watsonx.data provides a centralized gateway to access all data thanks to a unified metadata layer that spans both cloud and on-premises setups. Users can link to pre-existing analytics stores within hybrid infrastructures with minimal effort and begin exploring and modifying data using standard SQL. Additionally, watsonx.data integrates with multiple object stores, such as Amazon S3 and IBM Cloud Object Storage, as well as popular databases such as MongoDB, MySQL and PostgreSQL.
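
To make the idea of one SQL query spanning several stores concrete, here is a minimal Python sketch. It does not use watsonx.data or any IBM API. As an assumption for illustration only, two in-memory SQLite databases (the hypothetical names `sales_db` and `lake`) stand in for separate sources, and a single standard SQL join reads from both without first copying data from one into the other.

```python
import sqlite3

# Illustration only: SQLite is a stand-in, not the engine watsonx.data uses.
# Each attached in-memory database plays the role of an independent source.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales_db")  # hypothetical operational database
conn.execute("ATTACH DATABASE ':memory:' AS lake")      # hypothetical object-store table

# Create one table per source and load a few sample rows.
conn.execute("CREATE TABLE sales_db.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales_db.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO lake.orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)


def revenue_by_customer(connection):
    """Join rows from both sources in a single standard SQL statement."""
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM sales_db.customers AS c
        JOIN lake.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY total DESC
    """
    return connection.execute(query).fetchall()


totals = revenue_by_customer(conn)
for name, total in totals:
    print(f"{name}: {total}")
```

The point of the sketch is the shape of the query: the analyst names two sources in one statement and lets the engine resolve where each table lives, which is the role the article attributes to the shared metadata layer.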

Its architecture enables watsonx.data to provide diverse solutions for different users. It allows data scientists to efficiently deploy AI/ML models, with an emphasis on data governance and reproducibility. Data analysts can quickly merge data from different sources, streamlining analytics and BI without the complexities of manual data movement. Data engineers benefit from simplified pipelines and transformations using tools like SQL, Python or AI interfaces. Moreover, watsonx.data promotes responsible data sharing, balancing broad access with strict security and compliance protocols.

In line with the broader history of Watson products, watsonx.data is a flexible platform for use across various industries. It can analyze trends, detect fraud and optimize portfolios in financial services, and for retailers it can enhance customer experiences, tailor marketing efforts and forecast inventory needs. In healthcare, it helps identify diseases, optimize treatments and reduce costs. Manufacturing firms can leverage it to boost efficiency, streamline production and minimize waste by monitoring equipment and refining supply chains.

Competition

IBM watsonx.data faces plenty of competition in the data and analytics domain. Amazon Redshift, a managed cloud data warehouse, stands out for its robust performance and scalability, supporting organizations that need extensive data storage and analytics. Google BigQuery, with its serverless cloud architecture, promises swift, large-scale data analysis. Microsoft Azure Synapse Analytics excels in processing vast data quantities and interpreting varied data sources. Snowflake Data Warehouse offers a cloud solution known for its performance, scalability and adaptability in handling various data formats. SAP HANA Cloud provides a platform for real-time data analysis, emphasizing its in-memory capabilities. Oracle’s Autonomous Data Warehouse has an established reputation for security, scalability and performance in a cloud-native environment. Databricks Lakehouse combines the flexibility and scalability of data lakes with the security and governance of data warehouses, while Azure Databricks champions seamless data pipeline creation and machine learning model management. Amazon SageMaker streamlines machine learning model processes from building to deployment. Finding the best fit for your organization among watsonx.data and these contenders hinges on your specific business requirements, from data volume and performance criteria to budget constraints, and on the features you need.

Summary

IBM’s watsonx.data has emerged as a robust data lakehouse platform with an array of features that businesses will want to put to use right away. Its most notable offerings are versatile query engines, an integrated dashboard, seamless deployments in hybrid and multi-cloud landscapes, unified metadata and strong governance frameworks. Serving a spectrum of industries, including finance, retail, healthcare and manufacturing, this platform simplifies data processes and analytics. For those in pursuit of a top-tier lakehouse solution, IBM watsonx.data presents itself as an excellent choice. Observing IBM’s evolution from the RS/6000 days, through the genesis of Watson, to the recent watsonx rebranding has been intriguing—and a good reminder of IBM’s profound impact on today’s data-driven era.

Robert Kramer
VP & Principal Analyst at Moor Insights & Strategy

Robert Kramer is vice president and principal analyst covering enterprise data, including data management, databases, data lakes, data observability, data analytics, and data protection. Robert has over 30 years of proven experience with startups, IT companies, global marketing, detailed strategies, business modeling, and planning, working with enterprise companies, GTM assets, management, and execution.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.