Issue 202
This week has some serious “where are we and how did we get here” vibes. From dashboard design to telemetry and tracing collection to the history of observable systems, we’ve got a bit of everything. Enjoy! 📈📚🍻
This issue is sponsored by:
Ready to kickstart your incident response improvement efforts in 2023? Join FireHydrant on Wednesday, Feb. 8 for a webinar on How to evaluate and improve how you manage incidents. Learn what metrics you should monitor, common benchmarks, and how to show improvements and prove ROI.
Articles & News on monitoring.love
Observability & Monitoring Community Slack
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
Unreadable Metrics: Why You Can’t Find Anything in Your Monitoring Dashboards
I “grew up” in this industry cutting my teeth on dashboard design and usability. Tools like Grafana make this a lot easier than it used to be, back when we crafted our own charts and pages using D3.js. Still, it can be almost too easy to vomit a bunch of graphs on a monitor and call it a day. This article does a good job calling out the design considerations that will turn your dashboard into a truly useful resource for your team.
Can We Stop With Those Horrible “System Overview” Dashboards Already?
Another examination of dashboard design, this time with an emphasis on the telemetry and signals used to inform our dashboards and the responders who rely on them.
Observability: The hidden stories in your data
If you’re not already leveraging logs in your Observability story, this author has a bone to pick with you. Seriously though, this is a solid look at why structured logging can help you surface more insights from your systems.
Alerting and how 50 lines of code changed how we do it
Really appreciate when an engineer works through a complex problem and shares their solution publicly. I learned a lot more about ElastAlert (and a little Scala) than I expected, tbqh.
Guide to Distributed Tracing with OpenTelemetry Dotnet
We haven’t seen many distributed tracing stories lately, so it’s refreshing to include a guide for setting up OpenTelemetry spans with .NET projects.
Distributed tracing with OpenTelemetry in your Go/Python microservices
Wait, didn’t I just say… oh well, here’s another distributed tracing example. This time for using OpenTelemetry with Golang and Python services. 😂
Open-Source Tracing Tools: Jaeger Vs. Zipkin Vs. Grafana Tempo
If you enjoyed the OpenTelemetry articles above but haven’t taken the plunge for yourself, this article provides a thorough comparison of the major players among open-source distributed tracing backends.
MetricFire is a hosted monitoring solution that allows you to get the data you need. We offer an easy-to-use product with beautiful open-source dashboards and enhanced alerting. Using a tried and true Graphite infrastructure, MetricFire is the fastest and easiest way to monitor your metrics. Get started for free or book a demo here. (SPONSORED)
Observability, Runbooks, and Postmortems (oh my!)
This post touches on a variety of incident-related topics without taking itself too seriously. If you enjoy this one and consider yourself new to SRE topics, I’d recommend hopping over to Google’s site and reading the free online copies of their SRE books.
Observability, Monitoring, Alerting
I love that this author somehow managed to squeeze concepts and details from a dozen different references into one cohesive story. Breeze through this one and then allow yourself to dive into the list of references at the end.
Tools to manage SLOs and error budgets
A top-down review of SLOs, error budgets, and a variety of tools to help your teams manage them.
Learn the history — The Path to Observability
A look back on the types of telemetry sources that have influenced how we think about Observability in modern software systems.
New Grafana 9.3.x and 9.2.x releases to address high and medium severity CVEs.
Events
CFP open for Monitorama PDX 2023
This is the final week to submit talks for Monitorama PDX 2023. Get your proposal in by the Feb 3 deadline! 🤩
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor