Reframing Kubernetes Observability with a Graph

Originally posted on The New Stack.

Thinking of DevOps as a graph is the best way to increase efficiency and improve processes, even when dealing with the complexity of Kubernetes and other cloud-first technology.

Kubernetes makes it possible to deploy applications across multiple hosts while letting teams manage them as a single logical unit. It abstracts the underlying infrastructure and provides a uniform API for interacting with clusters along with automation for simplified workflows. It’s the perfect system for modern development practices.

But ensuring efficiency and resilience in these cloud-first ecosystems is not easy. Microservice architectures make it difficult to keep pace with the constant stream of software and infrastructure changes. Disjointed monitoring and observability tools make the problem worse, as does information siloed across teams and individuals.

To keep up, organizations must think about DevOps and Kubernetes in a new way — as a graph.

DevOps as a Graph

DevOps often focuses on automation and integration without considering the relationships and dependencies between underlying tools and processes. On the other hand, thinking of DevOps as a graph places greater emphasis on those connections to provide better context that leads to more effective action. Where traditional DevOps methodologies often rely on linear, sequential workflows, thinking about DevOps as a graph helps organizations adopt a more holistic, systems-based approach.

By modeling the different components of the DevOps pipeline as nodes in a graph, organizations can gain a better understanding of how different components interact and how changes in one area can affect the entire system. This can help organizations take a more proactive, strategic approach to DevOps, rather than simply reacting to issues as they arise.
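As a minimal sketch of that idea, the snippet below models a pipeline as a directed graph and walks it breadth-first to list everything downstream of a change. The component names and the `impacted_by` helper are hypothetical and chosen only for illustration; a real pipeline would populate the graph from its own tooling.

```python
from collections import deque

# Hypothetical pipeline components. An entry "A": ["B"] means B depends on A,
# so a change to A can affect B.
DEPENDENTS = {
    "git-repo": ["ci-build"],
    "ci-build": ["container-image"],
    "container-image": ["staging-deploy", "prod-deploy"],
    "helm-chart": ["staging-deploy", "prod-deploy"],
    "staging-deploy": [],
    "prod-deploy": [],
}


def impacted_by(change, dependents):
    """Return every component reachable from `change`, in breadth-first order."""
    visited = {change}
    impacted = []
    queue = deque([change])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                impacted.append(nxt)
                queue.append(nxt)
    return impacted


if __name__ == "__main__":
    print(impacted_by("ci-build", DEPENDENTS))
```

Asking what a change to `ci-build` touches returns the image and both deployments, which is the kind of "change here affects there" question a linear view of the pipeline does not answer directly.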

Thinking about DevOps in this way requires a shift in both mindset and practice, from a tool-centric approach to a more system-based one.

The shift is not easy, but it ultimately makes teams and organizations more data-driven and proactive.

Graph and Kubernetes

In a Kubernetes-based DevOps pipeline, many components can be modeled and analyzed using a graph-based approach. For example, the containers, services and pods in a Kubernetes cluster can be represented as nodes in a graph, and the relationships and interactions between them can be represented as edges. By analyzing this graph, organizations can gain insights into the performance of their Kubernetes-based DevOps pipeline, such as where bottlenecks occur, and use them to troubleshoot and optimize workflows.
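One way to picture this mapping is the sketch below, which turns simplified service and pod descriptions into typed nodes and labeled edges. The input dictionaries are hypothetical, loosely shaped like Kubernetes manifests rather than taken from a real API response, and the edge labels `runs` and `routes_to` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str
    name: str


# Hypothetical, simplified resource descriptions.
SERVICES = [{"name": "checkout", "selector": {"app": "checkout"}}]
PODS = [
    {"name": "checkout-7d9f", "labels": {"app": "checkout", "tier": "web"},
     "containers": ["app", "envoy"]},
    {"name": "cart-5c2a", "labels": {"app": "cart"}, "containers": ["app"]},
]


def build_graph(services, pods):
    """Return (nodes, edges), where each edge is (source, relation, target)."""
    nodes, edges = set(), set()
    for pod in pods:
        pod_node = Node("Pod", pod["name"])
        nodes.add(pod_node)
        for container in pod["containers"]:
            container_node = Node("Container", f'{pod["name"]}/{container}')
            nodes.add(container_node)
            edges.add((pod_node, "runs", container_node))
    for svc in services:
        svc_node = Node("Service", svc["name"])
        nodes.add(svc_node)
        selector = svc["selector"]
        for pod in pods:
            # A pod matches when its labels contain every selector pair.
            # An empty selector is treated as matching nothing.
            if selector and selector.items() <= pod["labels"].items():
                edges.add((svc_node, "routes_to", Node("Pod", pod["name"])))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph(SERVICES, PODS)
    for source, relation, target in sorted(edges, key=str):
        print(source.name, relation, target.name)
```

Keeping the relation name on each edge means later queries can tell a routing relationship apart from a containment one.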

More specifically, a graph approach applied to Kubernetes deployments allows for:

Knowledge Capture and Retention

Through the visualization of different components in a Kubernetes deployment as a graph, organizations can gain a better understanding of how different components interact and how changes in one area may affect others. For instance, it can show that a particular service is heavily relied upon by other components. Or when dealing with external resources like Amazon Relational Database Service (RDS) or DynamoDB, organizations can notice which pod relies on which database for a clear picture of dependencies and risk.

Here’s how this is implemented in practice:

In the DevOps observability platform, we set up a scene for Kubernetes/Amazon Elastic Kubernetes Service (EKS). A scene provides a topological view of what the Kubernetes architecture looks like. In this case, we created a simple Kubernetes infrastructure map that automatically discovers and visualizes all Kubernetes dependencies to help keep track of changes. The dependency map includes resources from clusters and services to pods, containers and processes.

A scene can incorporate metric indicators, allowing users to learn the relationships between pods, nodes and namespaces and the metrics tied to them. If the selected entity has any associated metrics, they’ll appear under the metrics tab within the Context Menu.

Clicking on any of the listed metrics will generate a chart window for it.

You can then add the chart to a dashboard to help identify issues and establish root causes for problems.
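The same dependency questions can also be asked of the graph in code. The sketch below is illustrative only: the edge list, the `rds/` and `dynamodb/` name prefixes and both helper functions are hypothetical, not part of any platform's API. It counts how many components rely on each target and groups pods by the database they depend on.

```python
from collections import Counter, defaultdict

# Hypothetical edges: (source, target) means "source depends on target".
DEPENDS_ON = [
    ("pod/checkout", "svc/payments"),
    ("pod/cart", "svc/payments"),
    ("pod/orders", "svc/payments"),
    ("pod/orders", "rds/orders-db"),
    ("pod/cart", "dynamodb/cart-table"),
    ("pod/checkout", "rds/orders-db"),
]


def dependents_count(edges):
    """Count how many components depend on each target (its in-degree)."""
    return Counter(target for _, target in edges)


def pods_by_database(edges, prefixes=("rds/", "dynamodb/")):
    """Map each external database to the sorted list of pods that rely on it."""
    result = defaultdict(list)
    for source, target in edges:
        if target.startswith(prefixes):
            result[target].append(source)
    return {db: sorted(pods) for db, pods in result.items()}


if __name__ == "__main__":
    print(dependents_count(DEPENDS_ON).most_common(1))
    print(pods_by_database(DEPENDS_ON))
```

A target with a high dependent count is a candidate for closer monitoring, since a fault there reaches the most components.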

Optimization and Troubleshooting

Organizations can also identify bottlenecks and optimize the flow of work by analyzing the graph of their Kubernetes deployment. If a graph-based analysis shows that a particular pod is frequently causing timeouts or errors, organizations can investigate the cause and take steps to remediate it. Better yet, teams can see a timeline of related changes leading up to abnormal behavior and reveal the root cause in real time. Where troubleshooting once dragged on because teams were unaware of all the continuous changes in their environment, they can now connect cause and effect in a shared, context-driven space.
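To illustrate how a change timeline and a dependency graph can work together, the sketch below collects everything a failing entity depends on and then filters recent changes down to that scope. All names, timestamps and the one-hour window are made-up assumptions; a real tool would draw both the graph and the change events from live data.

```python
from datetime import datetime, timedelta

# Hypothetical dependency graph: each key depends on the entities in its list.
DEPENDS_ON = {
    "pod/checkout": ["svc/payments", "configmap/checkout"],
    "svc/payments": ["pod/payments"],
    "pod/payments": ["rds/payments-db"],
}

# Hypothetical change events: (timestamp, entity, description).
CHANGES = [
    (datetime(2023, 3, 1, 9, 0), "rds/payments-db", "parameter group updated"),
    (datetime(2023, 3, 1, 11, 40), "pod/payments", "image rolled to v2.4.1"),
    (datetime(2023, 3, 1, 11, 50), "pod/search", "replicas scaled to 5"),
    (datetime(2023, 3, 1, 11, 55), "configmap/checkout", "timeout lowered"),
]


def upstream(entity, depends_on):
    """Return `entity` plus everything it transitively depends on."""
    found = {entity}
    stack = [entity]
    while stack:
        for dep in depends_on.get(stack.pop(), []):
            if dep not in found:
                found.add(dep)
                stack.append(dep)
    return found


def suspect_changes(entity, incident_at, changes, depends_on,
                    window=timedelta(hours=1)):
    """Changes to `entity` or its dependencies within `window`, newest first."""
    scope = upstream(entity, depends_on)
    hits = [
        change for change in changes
        if change[1] in scope and incident_at - window <= change[0] <= incident_at
    ]
    return sorted(hits, key=lambda change: change[0], reverse=True)


if __name__ == "__main__":
    incident = datetime(2023, 3, 1, 12, 0)
    for when, entity, what in suspect_changes("pod/checkout", incident,
                                              CHANGES, DEPENDS_ON):
        print(when.isoformat(), entity, what)
```

The unrelated `pod/search` change falls inside the time window but outside the dependency scope, so it is filtered out. That filtering is the practical benefit of tying the timeline to the graph.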

Resource Allocation

Optimizing resource allocation also becomes simpler. Through analysis of the relationships between components and requirements, organizations can identify opportunities to optimize resource usage and reduce costs. For example, a graph-based analysis could show that a particular pod is over-provisioned and should be scaled down. Without that analysis, it might be hard to tell which discrete part of the deployment is the problem.
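A rough sketch of such a check follows, assuming each pod node carries a CPU request and an observed peak usage in millicores. The figures, the field names and the 50 percent utilization threshold are all hypothetical; a sensible threshold depends on the workload.

```python
# Hypothetical per-pod metrics in millicores, attached to nodes in the graph.
PODS = {
    "checkout": {"cpu_request_m": 1000, "cpu_peak_m": 180},
    "cart": {"cpu_request_m": 500, "cpu_peak_m": 430},
    "search": {"cpu_request_m": 2000, "cpu_peak_m": 900},
}


def over_provisioned(pods, max_utilization=0.5):
    """Return (pod, utilization) pairs below the threshold, lowest first."""
    flagged = []
    for name, metrics in pods.items():
        utilization = metrics["cpu_peak_m"] / metrics["cpu_request_m"]
        if utilization < max_utilization:
            flagged.append((name, round(utilization, 2)))
    return sorted(flagged, key=lambda item: item[1])


if __name__ == "__main__":
    for name, utilization in over_provisioned(PODS):
        print(f"{name}: peak usage is {utilization:.0%} of its CPU request")
```

Sorting by utilization puts the largest gap between requested and used capacity first, which is usually where scaling down saves the most.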

Better DevOps

Ultimately, there are undeniable benefits to viewing DevOps as a graph. It provides a better, granular understanding of complex systems through relationship and workflow mapping. It improves visualization of discrete components for quick identification and remediation of issues across the environment.

Decision-making can be bolstered with data-driven insights that emerge from the patterns and relationships revealed in the graph. Simply put, it is the best way to increase efficiency and continually improve DevOps processes, even when dealing with the complexity of Kubernetes and other cloud-first technology.

To reach this level of operational efficiency, DevOps teams need to employ tools that offer a dependency graph that is unified or easily connected with a change timeline. It’s important that teams have a comprehensive view of all the disparate components in their deployments while also being able to note all their relationships and dependencies.

This can’t be done manually or on a case-by-case basis due to long-standing issues with knowledge gaps and complexity when it comes to cloud-first architectures like Kubernetes. Only modern change intelligence tools that can generate a real-time, accurate topology view of your architecture while adding context through relevant metrics can do so effectively.

Imagine all the time and headaches teams will save by not having to manually construct a graph, maintain it and manage the changes. Fortunately, you won’t have to imagine for long.
