A fun week of stories with an emphasis on logging, incident response, and alerting. Some interesting tools to play with this weekend too… enjoy! ☕🍂🧠

This issue is sponsored by:

Are you looking to modernize Log Analytics while controlling the cost?

DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. See DataSet in action at KubeCon or SREcon Europe and get a personalized demo, collect awesome swag, and win exciting prizes.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Smartening incident recovery

A look at how Razorpay improved their platform resilience by introducing self-service alerting and automated incident management.

Incident Review: Shepherd Cache Delays

A postmortem of an incident affecting Honeycomb’s ingest system last month. It’s always interesting to discover unexpected failure patterns for systems that you swear you know completely; props to their engineers for sharing lessons learned from this outage.

Getting started with Prometheus

A very handy guide for anyone new to Prometheus monitoring and alerting. This is a good one to share with any developers new to our domain.

SRE Journey — Alerting your SLO Part 1

Examples for alerting on your service levels with Prometheus and PromQL.

How to Tail Kubernetes Logs: kubectl Command Explained

A solid primer on Kubernetes logging and interacting with the logs using kubectl.

Observability Mythbusters: Observability Anti-Patterns

A reminder that not everything with “Observability” in the name really is.

AWS Build On Observability Day - Show notes

Show notes and videos from Amazon’s recent “Build On Observability Day” event.

Automated Distributed Tracing Using eBPF (Part 1)

There are different approaches for handling distributed tracing context propagation. In this article, ContainIQ explains their use of metadata-based correlation with eBPF.

2022 State of Logs Report

New Relic has published their annual “State of Logs” report with some interesting takeaways and developing trends.

Murre - the lightweight K8s metrics monitoring tool

Groundcover has released a new OSS tool that looks a lot like “top”, but for Kubernetes clusters.

URL monitoring made easy: self-hosted open-source tool for checking your website availability

How to host your own Pingdom-like website monitoring with the open-source HotHost project.



A lightweight, minimalistic, free and open-source server and HTTP monitor.


Murre is an on-demand, scalable source of container resource metrics for K8s.


Monitorama PDX 2023 - June 26-28 (Portland, OR)

Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor