This was an unusually rich week for monitoring and reliability topics. If you’re a time-series nerd like me, I think you’ll really enjoy the benchmark comparison between VictoriaMetrics and Grafana Mimir. Enjoy! 📈☕✨
This issue is sponsored by:
Are you looking to modernize Log Analytics while controlling the cost?
DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. Get complete visibility into your entire stack and experience the DataSet difference for free.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
I’ve often preached to peers about the importance of monitoring and observability in the context of your product and users’ workflows. However, this is the first time I’ve heard of Critical User Journeys (CUJs); this strikes me as a fantastic way to frame this topic and to further the adoption of SLOs.
An impressively detailed look at how Cloudflare ensures that their Prometheus queries and alerts are as reliable as possible.
I always enjoy seeing how different companies approach building their own monitoring stacks. This week we have an engineer from Ninja Van sharing the details of their architecture.
An interesting read from Grafana on how they adapted Mimir to be compatible with time-series formats other than Prometheus.
The fear of changing our systems can be paralyzing. This post looks at how we might better plan for change in a way that instills confidence, rather than eroding it.
Glad to see folks thinking about the intersection between test automation and observability. A really good primer to share with your favorite CI/CD engineers.
Engineers at VictoriaMetrics ran a performance benchmark against Grafana Mimir. Competition aside, it’s great to see teams across the two companies cooperating to ensure a level playing field. I’d love to see this continued as an ongoing series across the spectrum of TSDB systems out there.
P.S. I’m not at all surprised to see some of the data (e.g. memory use) from Mimir given their in-memory work to accommodate non-Prometheus metric formats.
An insightful yet concise review of a recent incident at Honeycomb.
I love discovering small, sharp tools like this one. If you ever need to monitor live HTTP traffic in a tcpdump-like manner, but captured asynchronously in logs, this is a good place to start.
A quick but handy tip to keep in mind when monitoring your Kubernetes jobs.
“httpry is a tool designed for displaying and logging HTTP traffic.”
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor