Found some production debugging stories this week, along with a solid dose of Kubernetes, logging, and Prometheus posts. Enjoy! 🚢🌷🚲

This issue is sponsored by:

Chronosphere

Problem scenario: Your ratio of signal to noise feels completely out of sync, leading to a frustratingly large number of alerts. Alert thresholds become less relevant, and engineers on-call become numb to false positives. Learn how to pay down this “monitoring debt” and get better quality signals in our latest blog.

Articles & News on

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

How we reduced our Prometheus infrastructure footprint by a third

How one team profiled their Prometheus metrics usage and found a massive storage savings win.

How to collect and query Kubernetes logs with Grafana Loki, Grafana, and Grafana Agent

A fantastic guide from Grafana Labs on using their open source components to manage and monitor Kubernetes logs.

The Story of One Mistake: How a Database Monitoring Tool Can Help a Developer

Debugging a performance regression after a simple change is always fun… right? Props to the author for capturing all of the visualization evidence for our entertainment. 😻

Node log access via Kubernetes API

Found this note in the K8s blog about an alpha feature (NodeLogQuery) allowing cluster admins to query node service logs.

Monitoring Made Fun: A Beginner’s Guide to Prometheus

A great article to share with your friends who might be new to monitoring in general, or Prometheus specifically.

AWS CloudWatch Logs — 101

A grab-bag of handy tips for AWS CloudWatch logs and alarms. Not just for beginners, either.

Fluent Bit: A brief introduction

I’d quibble with the characterization of this post as an “introduction” because the author walks through the key concepts, explains each stage of processing, and provides hands-on examples. IMHO this is a very solid guide to Fluent Bit.

The Linux Process Journey — “kdamond”

I might be late to the party here, but this is the first I’ve heard of DAMON. Hopefully we’ll start to see more examples of it in the wild soon.

Progress Reporting In PostgreSQL

We don’t usually cover database monitoring in depth here, but when I saw these PostgreSQL queries I had to share them. Dare you to copy/paste them into production. 😆

Deploying Prometheus and Grafana using Helm in EKS

A simple guide that does precisely what the title says it does.


Monitorama 2023 PDX

Monitorama has announced their full agenda for this year’s event. Looks like an awesome collection of topics and speakers. Hope to see you there!

Job Opportunities

Engineering Manager SRE at CircleCI (CA Remote)

Database Administrator at FORM (US Remote)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor