Google’s Data Cloud Summit Serves Up A Breadth Of New Capabilities

I mentioned in my preview of the Google Data Cloud Summit this week that I was expecting some exciting technology announcements for AI, machine learning, data management, and analytics. Google did not disappoint in that department. 

Google Cloud's stated mission is to accelerate every organization's ability to transform through data-powered innovation. That theme should come as no surprise to anyone, but in Google's case it is backed by a slew of new technologies and innovations. 


Vertex AI

In case you missed it, Google announced Vertex AI at Google I/O on May 18th.

A well-known fact in the world of machine learning (ML) is that it takes a lot longer to deploy ML models to production than to develop them. And that assumes you can find the talent to work on the project in the first place.

Over time, Google has learned essential lessons on building, deploying, and maintaining ML models in production. That knowledge forms the foundation and design of Vertex AI, which combines Google Cloud's services for building ML under a single unified UI and API, simplifying building, training, and deploying machine learning models at scale.

Vertex is an end-to-end managed AI service GOOGLE

Vertex AI provides, in a single platform, every tool required to manage data and to prototype, experiment with, deploy, interpret, and monitor models in production, without requiring formal ML training. 

That last point is critical given the scarcity of talent. Your data scientists are not required to be ML engineers. 

It is great to see Google Cloud add this capability. No one questioned Google's technical chops, but enterprises were looking for a way to simplify and coordinate ML. And here it is.

Dataplex

In most organizations, data resides in multiple locations such as data lakes, data warehouses, and other specialized data marts. In this scenario, customers need to define and enforce consistent policies across all the data, irrespective of where it physically resides.

Dataplex is an intelligent data fabric that enables centralized management, monitoring, and governance for data across all storage and analytics systems and makes it securely accessible by various analytics and data science tools. Dataplex includes Google Cloud and open-source tools, with built-in Google Artificial Intelligence (AI) and machine learning (ML) capabilities.
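To make the "define once, enforce everywhere" idea concrete, here is a minimal sketch of a centralized policy check applied uniformly to assets living in different storage systems. The registry, tags, and function names are hypothetical illustrations of the data-fabric concept, not the Dataplex API.

```python
# One policy registry, consulted regardless of where data physically lives.
POLICIES = {"pii": {"allowed_roles": {"data-steward"}}}  # hypothetical tag

# Assets spread across heterogeneous stores, governed by the same tags.
ASSETS = [
    {"name": "orders",    "store": "bigquery",      "tags": set()},
    {"name": "customers", "store": "cloud-storage", "tags": {"pii"}},
]

def can_read(role, asset):
    """A single, centralized policy check, independent of the storage system."""
    for tag in asset["tags"]:
        if role not in POLICIES[tag]["allowed_roles"]:
            return False
    return True

assert can_read("analyst", ASSETS[0])          # untagged data stays open
assert not can_read("analyst", ASSETS[1])      # PII blocked for analysts
assert can_read("data-steward", ASSETS[1])     # stewards are granted access
```

The point of the sketch is the shape of the solution: policy lives in one place, while enforcement follows the data wherever it resides.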

Dataplex is another example of functionality that abstracts you from the infrastructure to free up time and effort to focus on the business. 

Datastream 

Everyone wants to integrate and analyze data faster and use fewer system resources to achieve it. Change Data Capture (CDC) identifies changes to sync databases more efficiently. CDC is a requirement to provide replication capabilities across disparate data sources. Standard data replication methods are costly, cumbersome to set up, and require significant management overhead to run at scale.

Datastream is a serverless, cloud-native Change Data Capture (CDC) and replication service. Datastream reads CDC events (inserts, updates, and deletes) from source databases and writes those events with minimal latency to a data destination.
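The core CDC pattern Datastream implements can be sketched in a few lines: read an ordered stream of change events from a source and apply them to a replica. The event names and shapes below are illustrative only, not Datastream's actual wire format.

```python
def apply_cdc_events(replica, events):
    """Apply insert/update/delete events to a replica keyed by primary key."""
    for event in events:
        op, key = event["op"], event["key"]
        if op == "insert":
            replica[key] = event["row"]
        elif op == "update":
            replica[key].update(event["row"])  # merge only the changed columns
        elif op == "delete":
            replica.pop(key, None)
    return replica

# A hypothetical ordered change stream from a source database.
events = [
    {"op": "insert", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"op": "update", "key": 1, "row": {"city": "Paris"}},
    {"op": "insert", "key": 2, "row": {"name": "Grace", "city": "NYC"}},
    {"op": "delete", "key": 2},
]

replica = apply_cdc_events({}, events)
print(replica)  # {1: {'name': 'Ada', 'city': 'Paris'}}
```

Because only the deltas travel, the replica stays in sync without re-copying entire tables, which is exactly why CDC beats batch replication on latency and resource use.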

Datastream is a big step up from batch processing employed by many companies. Companies can now synchronize data across heterogeneous databases, storage systems, and applications with minimal latency to support real-time analytics, database replication, and event-driven architectures.

Analytics Hub 

Sharing and exchanging data with other organizations is not easy, especially with security threats and privacy regulations on the rise. But some companies need shared data to run their business effectively.

Batch data pipelines are expensive and unreliable to run, and the result is multiple copies of data, which can break data governance processes.

Analytics Hub is a new service built on BigQuery, Google’s petabyte-scale, serverless cloud data warehouse.

BigQuery’s architecture provides separation between compute and storage, enabling data publishers to share data without multiple copies. Data can be provided and consumed in real-time using the streaming capabilities of BigQuery. 

Analytics Hub builds upon BigQuery's capabilities for exchanging data by introducing shared datasets and exchanges. Shared datasets contain the views of data that you want to deliver to your subscribers; you publish shared datasets into an exchange to make them available. Exchanges keep shared datasets organized and secure. By default, exchanges are entirely private, which means that only the users and groups you grant access can view or subscribe to the data. You can also create internal exchanges or leverage public exchanges provided by Google. One interesting difference from others in the market is that Analytics Hub will include visualization (versus relying on third parties) and can give customers access to unique data sets that only Google can provide, such as Google Search Trends. 
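A toy model captures the two ideas above: an exchange is private by default, and subscribers get a live view of the publisher's data rather than a copy. The class and method names here are illustrative and are not the Analytics Hub API.

```python
class Exchange:
    """A minimal sketch of an exchange holding shared datasets."""

    def __init__(self, name):
        self.name = name
        self.datasets = {}    # shared datasets published into this exchange
        self.allowed = set()  # private by default: explicit grants only

    def publish(self, dataset_name, rows):
        self.datasets[dataset_name] = rows

    def grant(self, user):
        self.allowed.add(user)

    def subscribe(self, user, dataset_name):
        if user not in self.allowed:
            raise PermissionError(f"{user} has no access to {self.name}")
        # Return the same object: subscribers see a view, not a copy.
        return self.datasets[dataset_name]

exchange = Exchange("search-trends")
exchange.publish("trends", [{"term": "vertex ai", "volume": 100}])
exchange.grant("analyst@example.com")

view = exchange.subscribe("analyst@example.com", "trends")
exchange.datasets["trends"].append({"term": "dataplex", "volume": 80})
print(len(view))  # 2 -- the subscriber sees the publisher's update, no copy made
```

An ungranted user raises `PermissionError`, mirroring the "entirely private by default" behavior; the shared view is why governance does not break the way it does with copied batch extracts.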

BigQuery Omni and Looker for Microsoft Azure 

BigQuery Omni is a multi-cloud analytics solution that lets you securely analyze data across Google Cloud, Amazon Web Services (AWS), and now Microsoft Azure. The BigQuery user interface (UI) serves as a single pane of glass, using standard SQL and BigQuery APIs across data silos. BigQuery Omni, powered by Anthos, allows you to query data without having to manage the underlying infrastructure.

Looker is a business intelligence platform that helps you explore, analyze and share real-time business analytics easily. With Looker, you can connect, analyze and visualize data across Google Cloud, AWS, on-premises databases, and with this announcement, Microsoft Azure.  

The money slide

I hate to say it, but I love slide decks. Many people say, "we don't want to PowerPoint you to death," and I respond, "bring on the slides." You see, I am a visual learner, and I see bigger ideas in large and gnarly charts. The chart below says it all about Google Cloud's promise to provide value to every data user across the entire data lifecycle. And I think that's a big deal.

Google offers data services for everybody GOOGLE

Wrapping up 

Google has announced new capabilities that address key customer challenges in handling, sharing, and analyzing data. 

Since the beginning, Google has been a data company, and we now see it bringing capabilities developed for internal use to customers. Other cloud providers may have similar capabilities, but I believe Google should have an edge in managing and analyzing data because of this heritage. In my many conversations with CIOs and business leaders, if they are using Google Cloud, it is for "data" and "AI". This all makes sense, right?  

The Google Data Cloud Summit is an event not to be missed by anyone using or considering Google Cloud technologies. But if you did miss it, don't worry: the content will be available for on-demand viewing immediately following the live broadcast of each event here.

Note: Moor Insights & Strategy writers and editors may have contributed to this article.