Issue 281
Great collection of articles this week, including a big update from Prometheus and other topics from PromCon EU. Loving the deeply technical posts from Netflix, Pinterest, and IBM too.
Also a quick note that I’ll be taking a short hiatus from the newsletter for the next couple of weeks, returning for our quarterly “best of” issue on October 6. See you then! 👋💗🚵
Articles & News on monitoring.love
Observability & Monitoring Community Slack
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
Noisy Neighbor Detection with eBPF
Excellent write-up from Netflix on their use of eBPF for low-overhead instrumentation and profiling. The noisy neighbor example is awesome, but their findings on eBPF optimization are just as interesting imho.
UI Improvements for Prometheus 3.0
Julius Volz offered a sneak peek of the UI improvements in the upcoming Prometheus 3.0 release. Tons of changes are landing soon; check out the pre-release if you’d like to test it out and report any issues.
Prometheus 3.0 Unveiled: PromCon Highlights with Julius Volz
An interview with Julius Volz diving into the Prometheus 3.0 changes and other topics from the recent PromCon EU 2024. Really great stuff if you’re working with anything in the Prometheus ecosystem.
Improving Efficiency Of Goku Time Series Database at Pinterest (Part 3)
The latest post from Pinterest engineers on the evolution of their in-house TSDB. Even though this isn’t an open source project, I love reading about how they optimize write and read performance (and costs) in these systems.
OpenTelemetry and vendor neutrality: how to build an observability strategy with maximum flexibility
The ubiquity of OpenTelemetry has given users more power than ever before to avoid the hassles of vendor lock-in. But it’s not foolproof; there are still steps you can and should take to ensure that you’re using OTel effectively and giving yourself flexibility to adapt in the future.
Master Observability with OVM and OpenTelemetry
I haven’t heard anyone talking about this OVM project, but I found another announcement here and a whitepaper here that provide more context. Personally, I’d like to hear more about the real-world scenarios that inspired this design.
Developing an Automated Health Check for Cloud Services and Dependencies
This feels decidedly Nagios-like, but I can see where some folks might derive value out of something like this. OTOH it feels like it might suffer from drift pretty quickly.
VictoriaLogs: an overview, run in Kubernetes, LogsQL, and Grafana
Interesting look at VictoriaLogs, how it compares with Grafana Loki, and some of the missing bits that may hold back its adoption for now.
See you soon!
– Jason (@obfuscurity) Monitoring Weekly Editor