I hope you all enjoyed our quarterly “Best of” review last week. Today’s issue is a fun collection of hands-on articles and guides, heavy on microservices, OpenTelemetry, and Prometheus topics. Enjoy!
This issue is sponsored by:
Regardless of where you are on your incident management maturity journey, there’s a right next step you can take. Learn about three areas of focus — roles, services, and retros — why they’re important, and how to improve at any level in "3 ways to improve your incident management program in 2023."
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
An excellent overview of service levels and related concepts, with actually helpful query examples. This article is on the longer side but well worth your time.
If you’re running disparate Prometheus servers, you’re probably making life difficult for your users. This guide explains the basics of Prometheus Federation and gets you started with a simple configuration.
Having to call out to external providers can make it challenging to maintain a high degree of observability. This post introduces a couple of patterns for collecting OpenTelemetry spans in these scenarios.
A vendor-agnostic look at what Observability really means in a systems context. Useful for SRE folks and anyone else who cares for production services but might not actually be developing the services themselves.
Reading about how others think about alerts (and the failures of bad alerts) is something I’ll never get tired of, and unfortunately, something I think we’ll never really master as a discipline. Still, it’s important to share our learnings and continue evolving our practice.
I’m a little sad that this tool needs to exist (I wrote something similar for Graphite a decade ago?) but it does address a valid need. Worth adding to your sack of monitoring tools.
Just enough of a guide to get you up and running with OpenTelemetry and Zipkin before your coffee cools down.
I’m surprised to hear there are teams out there that will reopen incidents. If this is something you experience regularly, please read this article and then… just don’t.
We took our first look at the Micrometer project a few months ago, and it’s back again in the guise of someone equally surprised to discover it. I agree with the author here: the project creators did a nice job of designing an instrumentation library that abstracts out some of the pillar-ness of modern Observability approaches.
I love this use of the Ingress-NGINX controller to tag traces for Grafana Tempo in a multi-tenant Kubernetes environment.
We don’t cover Kafka much here, in spite of its importance in many Observability pipelines. If you’re not already familiar with it, this article covers some of its more common and appropriate uses.
“An application observability facade for the most popular observability tools. Think SLF4J, but for observability.”
Monitorama has announced their full agenda for this year’s event. Looks like an awesome collection of topics and speakers. Hope to see you there!
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor