Why is observability critical? Digital transformation is exploding, and cloud deployments are growing. Last quarter, the big three cloud providers, AWS, Google, and Microsoft Azure, reported $40 billion in revenues, up 36% year over year, which works out to $160 billion in annualized revenue.
That is not always wise cloud spending. It is not just an absolute explosion of data but also an increase in complexity, especially when you add multi-cloud and hybrid clouds. The way that data is analyzed to ensure that systems and software are working well has not changed: network operations centers and IT ops centers staffed with many people watching dashboards. With more data, software, infrastructure, applications, and end users, the triage process becomes that much harder. Organizations trying to manage that environment manually will inevitably run into substantial operational problems.
That dire prediction came during a recent conversation with Rick McConnell, Dynatrace CEO. Now, Rick has software that purports to avert the doomsday scenario, but I agree with him. In this article, we discuss why observability has become so important.
Marching toward zero-day response
Dynatrace has evolved from application monitoring to full-stack observability that delivers answers and intelligent information. IT, security, development, and (increasingly) business operations teams can triage faster and have better situational awareness, leading to automated remediation. The next step on the journey is building the logic and intelligence into code through an API, driving toward a zero-day response where software can be self-correcting, which is an exciting vision.
Dynatrace is a powerful end-to-end observability tool
What is the solution if the traditional monitoring approach of watching dashboards, responding to alerts, and manually analyzing data sets doesn’t work in complex cloud environments?
The view at Dynatrace is that the only way to prevent the collapse of the cloud is by deploying Artificial Intelligence for IT Operations (AIOps) capabilities to deliver precise answers as to where issues are happening and situational awareness of how the ecosystem is running.
Four capabilities enable the Dynatrace software intelligence platform. Dynatrace OneAgent automatically collects metrics throughout every tier of the application stack. Based on its findings, OneAgent automatically activates instrumentation specific to the stack and auto-injects tags into web application pages.
As OneAgent discovers all the components and dependencies in the application environment, Smartscape technology creates an interactive map of how everything is interconnected.
PurePath is a tracing and code-level analysis technology that automatically captures and analyzes transactions end-to-end across every tier of the application stack, from the browser down to the code and database level. Davis is the Artificial Intelligence (AI) engine that drives automation.
Understanding how these capabilities work together emphasizes the power of an integrated platform.
OneAgent and Smartscape create a vertical and horizontal topology capturing real-time data, allowing Davis to understand every relationship between entities at every moment. Davis automatically establishes and adjusts baselines for performance across the entire ecosystem. When problems arise, Davis detects them instantly with context on what went wrong.
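The idea of a baseline that adjusts itself can be illustrated with a minimal sketch. This is not how Davis works internally, which the article does not describe; it is a generic rolling mean and standard deviation check, and the class name, window size, and threshold are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


class RollingBaseline:
    """Hypothetical adaptive baseline: flags values far from the recent mean."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # only the most recent values count
        self.threshold = threshold           # allowed deviation, in standard deviations
        self.min_samples = min_samples       # learn quietly before judging anything

    def observe(self, value):
        """Return True if value is anomalous against the current baseline."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            # max(...) guards against a perfectly flat baseline (sigma == 0).
            anomalous = abs(value - mu) > self.threshold * max(sigma, 1e-9)
        # Only normal values update the baseline, so a spike does not
        # immediately become the new "normal".
        if not anomalous:
            self.samples.append(value)
        return anomalous


baseline = RollingBaseline()
response_times_ms = [100, 102, 98, 101, 99, 103, 100, 450, 101]
flags = [baseline.observe(v) for v in response_times_ms]
print(flags)
```

With these made-up response times, only the 450 ms reading is flagged. A production system would also have to account for seasonality, traffic volume, and the topology context described above.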
Davis knows the relationships and dependencies from the data center and hosts, through containers and services to cloud services and mobile applications.
PurePath captures and analyzes every transaction’s traces from browsers, mobile, and web services across every tier of the application stack.
Davis understands applications and services to the code level. When an entity has a problem, Davis understands critical metrics, such as users affected or business impact, and can react accordingly.
Davis locates the precise root cause with context. Operators will know if the problem results from a bottleneck or a deployment change and can understand the impact of a server-side problem on actual end-user experience.
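To see why a dependency map helps with root-cause analysis, consider a simplified sketch. It assumes a hypothetical topology dictionary and a set of currently unhealthy entities, and it treats an unhealthy entity as a root-cause candidate when none of its own dependencies is unhealthy. The entity names and the `root_causes` function are invented for illustration and do not represent the Dynatrace implementation.

```python
def root_causes(depends_on, unhealthy):
    """Return unhealthy entities that have no unhealthy dependency of their own."""
    causes = []
    for entity in sorted(unhealthy):
        deps = depends_on.get(entity, [])
        # If something this entity relies on is also unhealthy, the entity
        # is more likely a victim than the origin of the problem.
        if not any(dep in unhealthy for dep in deps):
            causes.append(entity)
    return causes


# Hypothetical topology: each entity maps to the entities it depends on.
topology = {
    "web-frontend": ["checkout-service"],
    "checkout-service": ["payments-db", "inventory-service"],
    "inventory-service": [],
    "payments-db": [],
}
unhealthy = {"web-frontend", "checkout-service", "payments-db"}

candidates = root_causes(topology, unhealthy)
print(candidates)  # ['payments-db']
```

Three entities are alerting, but only the database has no failing dependency beneath it, so the other two alerts can be grouped under a single problem instead of being triaged separately.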
In addition, Davis quantifies the business impact and prioritizes problems based on impact – e.g., problems that impact a significant number of users. This enables “BizDevSecOps” teams to focus on what matters.
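Prioritizing by impact can be sketched in a few lines. The `Problem` fields, the sample figures, and the ordering rule (users affected first, then revenue at risk) are assumptions for illustration; the article does not specify how Davis weighs business impact.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    users_affected: int
    revenue_at_risk: float  # hypothetical business metric, dollars per hour


def prioritize(problems):
    """Order problems so the largest user and business impact comes first."""
    return sorted(
        problems,
        key=lambda p: (p.users_affected, p.revenue_at_risk),
        reverse=True,
    )


# Made-up problems for illustration only.
open_problems = [
    Problem("slow image thumbnails", users_affected=40, revenue_at_risk=0.0),
    Problem("checkout errors", users_affected=1200, revenue_at_risk=9000.0),
    Problem("login latency", users_affected=1200, revenue_at_risk=500.0),
]

ranked = [p.name for p in prioritize(open_problems)]
print(ranked)
```

Here the checkout errors rank first: they affect as many users as the login latency but put more revenue at risk.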
Extending the platform to application security
Dynatrace Application Security is an example of how to add more value to a successfully integrated platform.
Runtime Application Self-Protection (RASP) is an emerging security technology that has been added to the Dynatrace OneAgent, extending the Dynatrace platform to application security.
Dynatrace combines RASP and observability for automatic and continuous analysis of applications, libraries, and code runtime in production and pre-production to detect, assess and manage vulnerabilities. Davis continuously watches entire production and pre-production environments to identify changes and provide answers about the source, nature, and severity of any vulnerabilities arising in real time. Dynatrace will detect what is running and pinpoint vulnerabilities instantly.
The traditional monitoring approach was adequate when we were stove-piped in a mainframe, on-premises Windows, UNIX, or Linux servers with homegrown applications.
Today, data resides in hybrid-cloud and multi-cloud environments, and often on servers at the edge. Additionally, our applications are API-based, which introduces myriad ways something can go wrong.
In short, if you deploy in the cloud, you need end-to-end observability with AIOps.
I advise the C-suite, and the notion of observability is not on the radar at that level. The lack of observability awareness will ultimately negatively affect the customer experience. The industry must do a better job of elevating the conversation in the C-suite.
As mission-critical applications get refactored or replaced in the cloud, observability will increase in importance. One solution is armies of people or multiple observability tools, in the same way people use different security tools. But, at some point, the complexity catches up, and an end-to-end package is the solution. Point products might be better at a specific task, but the time it takes to integrate best-of-breed parts outweighs that advantage when compared with the time-to-market of a complete integrated package. The security industry is heading in that direction, and I believe the observability industry will follow the same model.
True end-to-end observability embraces various data types: traces, scans, metrics, logs, behavioral analytics, and user metadata. Looking at only one of those data types, logs, gives you the hammer-in-search-of-a-nail problem. Security breaches are hard to find in logs, if they can be found there at all. Log4Shell is a good example of this and a use case showcasing how Dynatrace works. Whereas other solutions and approaches required days of work with imprecise results, Dynatrace identified all instances of Log4Shell in highly distributed hybrid and multicloud environments in minutes. What is needed is the ability to cross-correlate logs and system metrics together, giving a view and analytics of what happened. That end-to-end observability process through an automated AIOps engine is the secret sauce at Dynatrace.
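As a rough illustration of cross-correlating logs with system metrics, the sketch below pairs each metric anomaly with log events from the same host inside a short time window. The record layout, host names, and two-minute window are assumptions made for the example; it is a toy join, not a description of the Dynatrace engine.

```python
from datetime import datetime, timedelta


def correlate(log_events, metric_anomalies, window=timedelta(minutes=2)):
    """Pair each metric anomaly with same-host log messages within the window."""
    matches = {}
    for anomaly in metric_anomalies:
        related = [
            event["message"]
            for event in log_events
            if event["host"] == anomaly["host"]
            and abs(event["time"] - anomaly["time"]) <= window
        ]
        matches[(anomaly["host"], anomaly["metric"])] = related
    return matches


# Made-up sample data for illustration only.
logs = [
    {"host": "app-1", "time": datetime(2022, 1, 10, 9, 0, 30), "message": "connection pool exhausted"},
    {"host": "app-1", "time": datetime(2022, 1, 10, 9, 30, 0), "message": "cache refreshed"},
    {"host": "app-2", "time": datetime(2022, 1, 10, 9, 1, 0), "message": "user login"},
]
anomalies = [
    {"host": "app-1", "metric": "cpu", "time": datetime(2022, 1, 10, 9, 1, 0)},
]

result = correlate(logs, anomalies)
print(result)
```

Neither the CPU spike nor the log line explains much alone; side by side, they point to the exhausted connection pool on app-1 as the place to start looking.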
Note: Moor Insights & Strategy writers and editors may have contributed to this article.