This was a surprisingly rich week for fun and intriguing articles, with a particular emphasis on Kubernetes, Prometheus, and sustainable practices. I hope you enjoy them as much as I did! 😸📈☕
This issue is sponsored by:
Are you looking to modernize Log Analytics while controlling the cost?
DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. Get complete visibility into your entire stack and experience the DataSet difference for free.
Articles & News on monitoring.love
It’s been amazing to see the community continue to grow. We’d love to have you join us and share what you’ve been working on.
From The Community
I love this tale of how Reputation (the company) approached their distributed service reliability concerns. Unlike a lot of SLO stories I’ve read, this is a very approachable one that can serve as a model to other growing companies.
We’ve seen a bunch of “how to Prometheus” articles here, but I’m not sure I’ve seen one that is this concise yet so full of helpful pointers and references. Definitely give this one a look if you’re new to Prometheus or just want a quick refresher.
This isn’t the typical topic we cover here, but in light of the current state of the tech industry, I felt it would be prudent to share this with you all. This is an excellent article on sustainable work environments and each of us should be able to take away some valuable lessons from this post.
I can’t vouch for the why, but if you’re considering a move from Thanos to Mimir, this guide should help with the how.
This story of a disk performance issue on Kubernetes really hits close to home. It strains credulity that the underlying cAdvisor issue still hasn’t been fixed, at least seven years after the original bug report.
A recap of one vendor’s experience at KubeCon and the related observability events.
On a related note, the CNCF has uploaded videos and provided a playlist of talks from the recent PrometheusDay NA event.
A discussion on observability principles and benefits, framed in the context of Pipedrive’s own architecture and engineering needs.
To monitor a thing properly, we need to understand it first. How many times have you had to dig into some obscure performance issue, only to end up combing through kernel man pages (or worse, source code)? Save yourself some time and keep this post within arm’s reach.
Grafana recently announced a couple of new OSS projects, but I found this one the more interesting of the two. I haven’t tried it yet, but it sort of reminds me of a modern take on Riemann. Hopefully this one doesn’t require me to learn Clojure (sorry, Kyle).
A quick but handy pattern for monitoring your ECS task deployments using Amazon EventBridge.
Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor