On October 12th, I was honored to be a keynote speaker at Evolve 2022 in NYC. The event was a who’s who of data leaders and enthusiasts hosted by industry luminaries Cloudera, Intel, and IBM.
Cloudera chose the event to announce enhancements to the Cloudera Data Platform One (CDP One), which I reviewed recently. CDP One now includes an open lake house that enables self-service analytics across the entire data life cycle, from ingestion to machine learning, on any data and without requiring specialized operations. CDP One brings data and analytics to more line-of-business users, from expert developers to low-code data analysts.
Easy access to all types of data
Most of us have hopefully moved on from the pandemic and are now grappling with new hybrid work models and how to bring employees back to the office. There are multiple market forces at work today, and more than ever, businesses must have faster access to insights from the large volumes of data they gather to remain competitive.
This is probably not the first time you have heard that dire prediction, as many vendors use it as the setup for marketing a product or service to you. Unfortunately, the dire prediction is very accurate, leaving you to judge the product’s or service’s merits!
Most of you continue to collect and store vast amounts of data, whether on-premises, in multiple clouds, or at the edge. Shifting focus from collecting and storing data to using data more effectively and arriving at insights faster should be a top priority.
The key to success is to provide line-of-business users with easy access to all types of data regardless of how and where it is stored: structured, unstructured, real-time, batch, and streaming. That will provide the capability to rapidly create real-time dashboards, deliver new end-to-end business processes, and build new analytics and machine learning use cases in a matter of minutes instead of months or even years.
Essentially, that is a robust solution for the challenges outlined at the outset.
SaaS is the easiest “on-ramp” to the cloud
Regular readers will know I am a big proponent of software-as-a-service (SaaS) as the easiest “on-ramp” to the cloud. CDP One is a SaaS implementation that bundles services for data ingestion, governance, preparation, lake house/streaming analytics, and machine learning.
A SaaS implementation enables Cloudera to upgrade existing users with new features seamlessly. Cloudera did that recently with new hybrid data capabilities that make it easier to move data at scale across multiple clouds.
Quickly move data workloads and applications across clouds
In a hybrid world, having a data service that runs identically across different clouds makes life easier for users, administrators, and developers. Users will have the same experience regardless of where the data is stored or where the data applications are running. The same data analytics functions, Cloudera SDX security, and governance will run seamlessly with the cloud-native storage on the preferred cloud.
Portable Data Services avoid the costly redevelopment or rearchitecting of data applications. The CDP Data Services, which include Data Engineering, Data Warehousing, and Machine Learning, are each built on a unified code base and offer identical functionality on AWS, Azure, and on-prem private cloud.
Move data with security and governance policies
When data is needed, it often has to be fetched from another location. Security and governance policies must move with the data to ensure it stays safe and governed.
Cloudera’s solution to secure data replication is Replication Manager. Replication Manager moves the metadata that carries data security and governance policies with the data, eliminating the need to reimplement them.
The data movement could be from on-premises to the cloud or from cloud to cloud, in real time.
Replication Manager has a policy-driven interface that covers selecting the source systems and the data to replicate, pointing to the target environment, and deciding on the frequency and the resources to use.
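To make the idea of a policy-driven replication job more concrete, here is a minimal sketch of the choices described above (source, data, target, frequency, and resources) captured as a single policy object. It is purely illustrative and is not Cloudera's API: the `ReplicationPolicy` class, its field names, the `problems` check, and the example cluster and dataset names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical model of a replication policy (not Cloudera's actual API)."""

    name: str
    source: str                 # source system to replicate from
    datasets: Tuple[str, ...]   # data to replicate
    target: str                 # target environment
    frequency_minutes: int      # how often the policy runs
    max_bandwidth_mbps: int     # resource limit for each run

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means the policy looks sane."""
        issues = []
        if not self.datasets:
            issues.append("no datasets selected")
        if self.source == self.target:
            issues.append("source and target must differ")
        if self.frequency_minutes <= 0:
            issues.append("frequency must be positive")
        if self.max_bandwidth_mbps <= 0:
            issues.append("bandwidth limit must be positive")
        return issues


# Example: replicate two illustrative datasets from an on-premises cluster to a cloud target.
policy = ReplicationPolicy(
    name="nightly-sales",
    source="onprem-cluster",
    datasets=("sales.orders", "sales.customers"),
    target="cloud-environment",
    frequency_minutes=24 * 60,
    max_bandwidth_mbps=500,
)
print(policy.problems())
```

Keeping every decision in one declarative object is what makes such a job repeatable: the same policy can be reviewed, validated, and rerun on a schedule without anyone re-entering its settings.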
Planning for disaster recovery also involves moving data. Because Replication Manager migrates data together with its security and governance policies, it enables full backup and disaster recovery. Other use cases include migration from legacy clusters to cloud deployments, hybrid cloud flexibility through continuous synchronization, and the creation of development and test systems.
A data ingestion solution built for a hybrid data world
Universal Data Distribution enables control of data flows, from origination through all points of consumption, both on-premises and in the cloud, in a simple, secure, scalable, and cost-effective way.
Cloudera DataFlow enables universal data distribution, and Cloudera claims it is the first data ingestion solution built for a hybrid data world. It addresses the entire diversity of data movement use cases: batch, event-driven, edge, microservices, and continuous/streaming. DataFlow is a cloud-native Apache NiFi service available in CDP One.
DataFlow eliminates ingestion silos by allowing developers to connect to any data source with any structure, process the data, and deliver it anywhere using low-code development. DataFlow comes with more than 450 connectors and processors across the ecosystem of hybrid cloud services, including data lakes, lake houses, cloud warehouses, and on-premises sources. Data distribution flows can be version-controlled into a catalog where operators can self-serve deployments to different runtimes.
I believe Cloudera is one of the world’s top enterprise data platform companies. I base that conclusion on the fact that, in my estimation, it is the only game in town for end-to-end data management across hybrid, multi-cloud, and on-premises environments.
CDP One goes beyond simply gathering and storing data to enable quick and easy data analysis. Moreover, it tracks and secures this data across multiple environments.
As mentioned earlier, as data grows exponentially, you need the tools to enable rapid business transformation in an increasingly hybrid and multi-cloud environment.
Cloudera has a proven track record of moving data and workloads throughout a modern data architecture to meet evolving business requirements. Cloudera should be on your list if you want to leverage AI, ML, and hybrid architectures to drive your business forward.
Note: Moor Insights & Strategy writers and editors may have contributed to this article.