Lots of love for logging and incident reviews this week, plus a helpful guide for the Prometheus Certified Associate (PCA) exam. Enjoy! 🔥🪓👩‍🎓
This issue is sponsored by:
Many distributed tracing tools make finding an error slower than expected, which ultimately pushes everyday users to look for other ways to track errors down quickly. In this 30-minute on-demand demo, learn how Chronosphere is democratizing distributed tracing by making trace data easier to understand.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
An overview of the PCA exam from a candidate who passed it recently. What it covers, how to prepare for it, and which resources are available to help you study.
The next chapter of Sofia and her journey into observability enlightenment. Honestly, I have no idea where this is going but at least it’s creative and not written by AI (🤞).
I’ve seen numerous articles detailing the collaborative benefits of leveraging observability tools and processes towards security goals, but this might be the first one I’ve seen that drills into the topic with enough specificity that I feel like I learned something. A very well-written piece that I recommend sharing with your peers on both teams.
Few companies do an incident review like Honeycomb, and this one is no different. I always appreciate the transparency they demonstrate in these posts because it offers a great learning experience for others.
Loki seems to be gaining a lot of mindshare in the logging space. Here’s a quick post demonstrating one pattern for storing logs using its API.
Speaking of Loki, Grafana just released Loki version 2.9. This looks like a mild set of changes, but it’s always good to see reliability and documentation improvements.
While OpenTelemetry is becoming ubiquitous for observability needs, Micrometer maintains a strong presence within the world of JVM-based applications. This post introduces new users to the Micrometer library with some quick examples and next steps to consider.
A look at how one company reimagined and rearchitected their log infrastructure. Weird cliffhanger at the end though; hopefully we find out later why they pivoted in their next post.
Great to see so many companies “getting it” in terms of observability and equipping engineers with the data and tooling to make the right decisions. Unfortunately, this one also closes like a bit of an advert for a vendor; I wish we could get more details on how they use these products to solve actual problems. 🤷‍♂️
Although I don’t understand the fish reference, I always appreciate reading how other companies think about incident response and learning reviews. In my experience, this is one of those disciplines where we’re always learning from one another.
Grafana published their post-incident review (PIR) for the recent rotation and re-signing of their packages with a new public key.
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor