Issue 232

Lots of love for logging and incident reviews this week, plus a helpful guide for the Prometheus Certified Associate (PCA) exam. Enjoy! 🔥🪓👩‍🎓

This issue is sponsored by:

Chronosphere

Many distributed tracing tools make finding an error slower than expected, which ultimately pushes everyday users toward other ways of tracking errors down quickly. In this 30-minute on-demand demo, learn how Chronosphere is democratizing distributed tracing by making trace data easier to understand.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Prometheus Certified Associate: A Comprehensive Guide

An overview of the PCA exam from a candidate who passed it recently. What it covers, how to prepare for it, and which resources are available to help you study.

Sofia’s Observability Odyssey: A Blackbox Monitoring Adventure with a Chocolate Twist

The next chapter of Sofia and her journey into observability enlightenment. Honestly, I have no idea where this is going but at least it’s creative and not written by AI (🤞).

How to Combine Observability-driven Development and Security

I’ve seen numerous articles detailing the collaborative benefits of leveraging observability tools and processes towards security goals, but this might be the first one I’ve seen that really drills into the topic with enough specificity that I feel like I learned something. Very well-written piece that I recommend sharing with your peers on both teams.

Incident Review: What Comes Up Must First Go Down

Few companies do an incident review like Honeycomb, and this one is no different. I always appreciate the transparency they demonstrate in these posts because it offers a great learning experience for others.

Send your logs to Loki

Loki seems to be gaining a lot of mindshare in the logging space. Here’s a quick post demonstrating one pattern for storing logs using its API.

Grafana Loki 2.9 release

Speaking of Loki, Grafana just released version 2.9. This looks like a mild set of changes, but it’s always good to see reliability and documentation improvements.

Vendor-neutral Application Observability with Micrometer.io

While OpenTelemetry is becoming ubiquitous for observability needs, Micrometer maintains a strong presence within the world of JVM-based applications. This post introduces new users to the Micrometer library with some quick examples and next steps to consider.

Building a Clearer Picture: Rethinking Log Messages and Log Pipeline for Superior Observability at 1mg

A look at how one company reimagined and rearchitected their log infrastructure. Weird cliffhanger at the end though; hopefully we find out in their next post why they pivoted.

A guide to observability at Birdie

Great to see so many companies “getting it” in terms of observability and equipping engineers with the data and tooling to make the right decisions. Unfortunately this one also closes like a bit of an advert for a vendor; wish we could get more details on how they use these products to solve actual problems. 🤷‍♂️

The Trout / Incident Coefficient

Although I don’t understand the fish reference, I always appreciate reading how other companies think about incident response and learning reviews. In my experience, this is one of those disciplines where we’re always learning from one another.

Grafana security update: Post-incident review and timeline for GPG signing key rotation

Grafana published their post-incident review (PIR) for the recent rotation and re-signing of their packages with a new public key.

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor