Jamie Wilkinson’s slides from his recent Velocity NYC talk on SLOs are available. There are enough notes in the slide deck to make it useful without the video. I can’t wait for the video, though: it looks like a great talk.
These articles remind me of why I have a long way to go with my grasp of stats.
Want more security around your Prometheus endpoints and fluentd config? This article has you covered.
Visualization goes hand-in-hand with great monitoring but I’ve found too few of us really think hard about it. This article isn’t about monitoring at all, but rather talks about the business side of things and visualizing business KPIs. That said, there are great takeaways for those of us building visualizations or just creating charts for a report every now and then.
We’ve known for some time that using the average for things like latency hides a ton of data, which is why using the 95th or 99th percentile is now common. But the author makes another point: many vendors compute percentiles from pre-aggregated data, resulting in the same problem.
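To make the pre-aggregation problem concrete, here is a minimal sketch of my own (not taken from the article). The host names and latency numbers are invented, and the nearest-rank percentile is just one common definition. It compares the average of per-host p99 values against the p99 computed over all the raw samples.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds. One busy host serves most of the
# traffic and is slow for 2% of its requests; two idle hosts are always fast.
latencies_by_host = {
    "web-1": [10] * 9800 + [2000] * 200,
    "web-2": [10] * 100,
    "web-3": [10] * 100,
}

# Pre-aggregated approach: compute p99 per host, then average the results.
per_host_p99 = {
    host: percentile(samples, 99) for host, samples in latencies_by_host.items()
}
averaged_p99 = sum(per_host_p99.values()) / len(per_host_p99)

# Raw approach: compute p99 once over every sample from every host.
all_latencies = [ms for samples in latencies_by_host.values() for ms in samples]
true_p99 = percentile(all_latencies, 99)

print(f"per-host p99: {per_host_p99}")
print(f"average of per-host p99s: {averaged_p99:.1f} ms")
print(f"p99 over all requests: {true_p99} ms")
```

With these made-up numbers, the averaged figure comes out far lower than the p99 over all requests, because each host counts equally in the average no matter how much traffic it served.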
The cofounder of Wavefront talks a bit about their thoughts on time series architecture, must-have/nice-to-have features, and more.
Slides from a recent talk by Charity Majors, and it’s awesome. The beginning is more about why you should be testing in prod, but keep reading: it gets into some great observability stuff, including an apt comparison of monitoring a monolith vs. a distributed system.
Exactly as the title says: The folks at TimescaleDB suggest an architecture for highly-available Prometheus backed by Postgres and TimescaleDB.
A neat tool to ship web performance metrics (e.g., Lighthouse) to InfluxDB + Grafana.
To quote the author: What happens when you boot up a Pod? What happens to a Service before it is allocated a public IP address? How often is a Deployment’s status changing? kubespy is a small tool that makes it easy to observe how Kubernetes resources change in real time …
A new tool that offers a lightweight alternative to kube-state-metrics for k8s resource metrics.
Chock-full of awesome stuff, too, including Stackdriver as a core datasource and a much-improved Postgres query builder.
For those of you with Kafka in your monitoring streaming pipelines, this could be useful: a tool for replicating data between multiple Kafka clusters.
For the DTrace fans among you, rejoice in this awesome news.
Splunk is putting together a new event series that looks pretty neat, and it isn’t just a Splunk event in disguise. Best part: it’s free.
Nothing scheduled yet, but if you’re in Las Vegas, get on over there and join the group.
I’m launching a job board for monitoring and observability jobs. If you’ve got some monitoring/observability roles you’re trying to fill, how about heading on over there? I’ll be including them here in the newsletter as well.
See you next week!
— Mike (@mike_julian) Monitoring Weekly Editor