Issue 213
If this week’s collection is anything to go by, OpenTelemetry is eating the world. Still, there’s a little bit of everything here, from distributed tracing to Graphite support, and even some tips for shrinking your Datadog bill. Enjoy! 🌞🍻📈
Articles & News on monitoring.love
Observability & Monitoring Community Slack
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
How We Improved Our Monitoring Stack With Only a Few Small Changes
An engineer from Riskified details their journey of scaling, streamlining, and generally improving the resilience of their monitoring infrastructure.
A fascinating look at how Slack traces the deliverability of their notification system. And what a nugget in the closing of the story… “at least a dozen tracers running simultaneously in the Slack app”. 👀
Distributed System Debugging with OpenTelemetry and Teletrace: Real-World Examples
Teletrace looks like an interesting new-to-me project for visualizing OpenTelemetry traces and debugging distributed systems. Anyone using this in production?
Datadog: Metrics without Limit
Some lessons learned for reducing custom metrics usage with Datadog’s “Metrics without Limits” feature. Woof.
Observability Driven Development (ODD) - Enhancing System Reliability
Maybe I’m too close to the problem, but this always felt like the desired state to me anyways. Regardless, if framing it as an acronym helps adoption in your org, I’m all for it. 😉
Kubernetes 1.27: Query Node Logs Using The Kubelet API
More details about the new “Node log query” feature introduced with Kubernetes 1.27.
Releasing Graphite Query Language in Open Source VictoriaMetrics
An interesting update from VictoriaMetrics, announcing support for the Graphite query API in their open source release v1.90. I’m a little surprised there’s enough demand for this to justify the effort, but it still makes me smile.
Distributed Tracing: OpenTelemetry and Grafana Tempo
Another distributed tracing how-to. This one provides a bit more detail and relies on Grafana Tempo for querying and visualization.
Analyzing a Django App Using OpenTelemetry APM
A quick walkthrough for instrumenting and tracing your Django or Python web application.
Tools
“Teletrace is built from the ground up for modern applications. It is open-source and relies on open standards like OpenTelemetry. It is an easy-to-deploy scalable solution, that supports multiple storage options.”
Events
Monitorama has announced their full agenda for this year’s event. Looks like an awesome collection of topics and speakers. Hope to see you there!
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor