Troubleshoot and automate fixes for your K8s application

Ever wonder why pods crash in an endless loop, leaving you to describe them, check the events and logs, and figure out what went wrong? I ran into this at work when readiness probes started failing. It took a long time to find the causes, and more time again to fix them.
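For reference, the manual routine mentioned above usually comes down to three kubectl commands. The sketch below wraps them in a helper; the function name triage_pod and the KUBECTL override variable are my own additions for illustration, not part of kubectl or Robusta.

```shell
#!/bin/sh
# Manual triage for one pod: describe it, list its events, read the last crash's logs.
# KUBECTL is an override hook added for this sketch (KUBECTL=echo previews the commands).
KUBECTL="${KUBECTL:-kubectl}"

# triage_pod is a hypothetical helper name used only in this example.
triage_pod() {
  pod="$1"            # pod name (required)
  ns="${2:-default}"  # namespace, falls back to "default"

  # Container states, probe settings, and recent events for the pod
  "$KUBECTL" describe pod "$pod" -n "$ns"

  # Events that involve this pod only, sorted by timestamp
  "$KUBECTL" get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp

  # Logs from the previous (crashed) container instance
  "$KUBECTL" logs "$pod" -n "$ns" --previous
}
```

Calling `triage_pod my-pod my-namespace` runs the three steps in order. Repeating this for every failing pod is exactly the chore that gets tedious.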

The solution came with a cup of coffee called Robusta. Yes, I am talking about Robusta, an open-source platform for Kubernetes troubleshooting and automation. Just as Docker is infrastructure-as-code for deploying applications, Robusta is troubleshooting-as-code for maintaining K8s applications and handling their alerts.

In this blog, I will share how you can troubleshoot and automate fixes for those K8s application issues using Robusta. Make sure you have a K8s cluster set up (I use EKS here) and the necessary tools installed (kubectl, helm, python, and pip).
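Before going further, it can save time to confirm that the tools listed above are actually on your PATH. Here is a minimal sketch; check_tools is a hypothetical helper name, and on some systems the binaries are called python3 and pip3 instead.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Returns 0 when every tool is found, 1 otherwise.
# check_tools is a hypothetical helper name used only in this example.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (adjust the names to your system):
#   check_tools kubectl helm python pip
```

The function only checks that each command exists. It does not check versions or whether kubectl can reach your cluster.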


First, add the Robusta chart repository and install the Robusta CLI with pip.

helm repo add robusta <> && helm repo update
pip install -U robusta-cli --no-cache

Generate a Robusta configuration so that it can send issues and alerts to Slack.

robusta gen-config

Make sure you have a Slack workspace ready, since we are going to integrate with it. Grant Robusta permission to post to the channel where you want the results sent.


When the CLI asks, agree to create a Robusta platform account so that you can monitor issues on the dashboard. The CLI writes its output to generated_values.yaml. Keep this file private and do not share its values publicly. Now let's install Robusta.

helm install robusta robusta/robusta -f ./generated_values.yaml -n robusta --create-namespace
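Since generated_values.yaml has to stay private, it may be worth locking the file down on disk and keeping it out of version control. This is a small sketch, assuming you keep the file in a git working directory; protect_values_file is a hypothetical helper name.

```shell
#!/bin/sh
# Restrict a values file to its owner and make sure git ignores it.
# protect_values_file is a hypothetical helper name used only in this example.
protect_values_file() {
  file="${1:-generated_values.yaml}"  # path relative to the repository root
  ignore="${2:-.gitignore}"           # ignore file to update

  # Owner read/write only
  chmod 600 "$file" || return 1

  # Append the file name only if that exact line is not already present
  if ! grep -qxF -- "$file" "$ignore" 2>/dev/null; then
    echo "$file" >> "$ignore"
  fi
}
```

Running `protect_values_file` in the directory that holds the file is enough. Calling it twice does not duplicate the .gitignore entry.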

The installation takes a little while. I installed Robusta into a namespace called robusta rather than the default namespace. Once it is running, you will get notifications in your Slack channel. You can also install Prometheus so that Robusta can explain why your existing Prometheus alerts fire and suggest fixes for them.
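While the install settles, you can wait for the pods in the robusta namespace to become Ready instead of polling by hand. Below is a rough sketch; check_robusta and the KUBECTL override are names I made up for this example, and the 300s timeout is an arbitrary assumption.

```shell
#!/bin/sh
# KUBECTL is an override hook added for this sketch (KUBECTL=echo previews the commands).
KUBECTL="${KUBECTL:-kubectl}"

# check_robusta is a hypothetical helper name used only in this example.
check_robusta() {
  ns="${1:-robusta}"     # namespace used for the helm install
  timeout="${2:-300s}"   # how long to wait before giving up (arbitrary default)

  # Block until every pod in the namespace reports Ready, or the timeout expires
  "$KUBECTL" wait --for=condition=Ready pods --all -n "$ns" --timeout="$timeout" || return 1

  # Show the final pod list
  "$KUBECTL" get pods -n "$ns"
}
```

If the wait step times out, `kubectl describe pod` on the stuck pod is the usual next step.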


Now let's deploy something that crashes.

kubectl apply -f

Reference: the crash deployment YAML comes from the official Robusta project.
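After applying the manifest, one way to confirm from the terminal that a pod really is crash-looping is to filter the pod list by status. This is a minimal sketch; list_crashlooping and the KUBECTL override are hypothetical names for this example.

```shell
#!/bin/sh
# KUBECTL is an override hook added for this sketch, so the command can be stubbed.
KUBECTL="${KUBECTL:-kubectl}"

# list_crashlooping is a hypothetical helper name used only in this example.
# Prints pods whose STATUS column shows CrashLoopBackOff.
# Exit status is 0 when at least one such pod is found, non-zero otherwise.
list_crashlooping() {
  "$KUBECTL" get pods --all-namespaces --no-headers | grep CrashLoopBackOff
}
```

A freshly created pod may take a minute and a few restarts before its status switches to CrashLoopBackOff, so an empty result right after the apply is not conclusive.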

Since we have integrated Robusta with our Slack channel, any issue on the pods or nodes of the K8s cluster triggers a Slack notification. For the pod we just created, the result looks something like this.

crash pod

You can click through and investigate on the Robusta platform. Here is an overview of the Robusta dashboard with some issues raised in my EKS cluster.


It is easy to see which issues are going on 😃. Some mitigations and possible solutions are suggested too. Want to know more about how Robusta enhances Prometheus alerts? Check their documentation.


With that, we can easily troubleshoot issues on our K8s cluster and automate their fixes.
