Pure Storage’s New ‘Data Hub’ Storage Architecture Looks Compelling

By Patrick Moorhead - September 12, 2018
Figure: Pure's proposed "data hub architecture." (Image credit: Pure Storage)

At Moor Insights & Strategy we talk a whole lot about the changing nature of data. Analyst Steve McDowell and I have written a lot recently about the expanding scope of data and how every enterprise needs a data strategy, and we have even discussed the impact of data on storage. We talk about data because, at the end of the day, data is now a competitive differentiator. The "data-driven" enterprise is real. I believe success in this business world requires thinking differently about how data is captured, consumed, and stored.

It’s not just the nature of data that is changing. Data center architecture is also rapidly evolving. The rise of virtualization, convergence, and composable infrastructure from vendors such as Hewlett Packard Enterprise, Dell Technologies, Lenovo, and IBM is delivering unprecedented levels of flexibility to IT organizations. Workloads can now be run on any combination of infrastructure, whether on-premises, on the edge, in the cloud, or somewhere in between, all managed from a single pane of glass. These technologies make it easy to compose and deploy workloads across the organization.

This flexibility challenges traditional thinking about storage, where data is often partitioned into application-specific silos. Data silos don't scale in the new world of data, where a myriad of applications consume common sets of information. It may be time to rethink how data is stored and served. Pure Storage thinks so. The company released an open letter to the industry this morning saying as much.

Changing world of data

Central to the changing nature of data is the introduction of advanced analytics into enterprise workflows. Data that is captured and processed by traditional business applications is now consumed by deep learning systems, where patterns are captured and turned into insights. AI-driven analytics may guide an organization’s logistics systems, detect fraud patterns, derive customer purchasing patterns, or even intelligently staff retail locations. There are dozens of applications, with more emerging daily.

The nature of analytics demands that data be available as a consistent whole, not fractured into data silos, and that it be delivered to the compute engines at high speed and with low latency. As analytics becomes a driving force, data lakes will inevitably replace some data silos. Data lakes allow an enterprise-wide view of the totality of its structured and unstructured data.
While data lakes enable any application to access underlying data cohesively and consistently, the technology has held back enterprises in practice because the underlying architecture is not built to deliver data in real time and in a multi-dimensional way. We published a whitepaper today going in depth on the challenges that analytics and data silos place on enterprise storage systems. You can read that here.

The "Data Hub"

Pure Storage Inc. this morning released an open letter to the industry, boldly putting a stake in the ground and declaring that the old way of thinking about storage doesn’t scale in the emerging world of the analytics-driven data center. They call their new approach the “data hub.”

The data hub delivers on many promises to fulfill the vision of a data-driven enterprise. It offers high-throughput file and object storage while providing application-specific performance characteristics (e.g., latency, throughput, and input/output operations per second). It delivers application-specific capabilities to multiple clients simultaneously.

The data hub is a reimagining of the storage system. Central to the data hub vision is that storage becomes multi-dimensional. The data hub offers multiple parallel compute and storage elements that can be individually partitioned and tuned for the various workloads that are attached. Software orchestration becomes critical beyond the underlying hardware configuration.

The data hub is designed to alleviate the challenges imposed by traditional architectures. It enables an organization’s data to be delivered to multiple applications, matching the needs of those applications.

FlashBlade is the first Data Hub iteration

Pure Storage understands AI-driven analytics better than most storage companies. I wrote about Pure’s collaboration with NVIDIA in designing the AIRI reference architecture for machine learning and AI earlier this year. Central to AIRI is the Pure Storage FlashBlade.
Pure’s FlashBlade is a different kind of storage machine. It is designed not as a traditional controller-based system but instead contains many high-performance storage blades. Each storage blade is capable of delivering data with the capabilities demanded by the attached client. Scalability becomes a matter of adding blades.

It’s clear that FlashBlade is designed with the data hub concept in mind. It provides the foundational elements required to deliver on the vision Pure articulates in its open letter.

Wrapping up

Pure Storage has always been an innovator. It was one of the first to market with the all-flash array, pushing its larger competitors to compete. Pure has done the same thing with the adoption of NVMe and high-performance storage. FlashBlade and AIRI brought Pure into the world of advanced analytics.

Pure clearly understands the challenges that arise when bringing analytics into the enterprise. Pure has turned its learnings from delivering storage to deep learning and AI environments with AIRI into an understanding of how to bring those technologies into the enterprise.

There is no question that Pure Storage is right in saying that the new data center requires a new way of storing and delivering data. The data hub is an enticing approach. Whether it’s the right approach will emerge over time. I’m excited to see a technology company talking not about the usual speeds and feeds, but instead having a conversation about the way that IT delivers on the value of its data. Every technology company should participate in this conversation.

Note: Storage analyst Steve McDowell contributed significantly to this blog.
Patrick founded the firm based on his real-world technology experiences and his understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience, including 15 years of executive experience at high-tech companies (NCR, AT&T, Compaq (now HP), and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.