Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe got an idea rolling around in your head? Send it on over and let the community learn from you. :D
Monitoring News, Articles, and Blog Posts
Monitoring in a DevOps World
We all understand that infrastructure has changed dramatically in recent years, as new methodologies and tools have started spreading. But has your approach to monitoring and observability changed with it? Are you sure?
Why We Built Our Own Distributed Column Store by Sam Stokes - (video)
For those of you out there building monitoring products (rather than using/running them), you’ll like this talk on how Honeycomb designed and built their backend storage. One of the biggest challenges with building a monitoring product comes down to the datastore, so this is a really interesting watch.
Observability: it’s not just an ops thing by Christine Yen - (video)
Another one from the folks at Honeycomb, this time about why observability matters so much to developers.
Zebras All the Way Down - Bryan Cantrill (video)
If you’ve not had the pleasure of watching a Bryan Cantrill talk before, you’re in for a real treat. In this talk, Bryan goes into what it’s like to run an infrastructure of so-called “zebras” and the unique challenges that presents. In other words, it’s all about observability.
There are quite a few different components and facets to monitoring in AWS using AWS-provided services. CloudWatch itself is a gigantic monstrosity of UX with plenty of data to match. This post helps you make sense of all of it.
What do you personally do to make on-call better for your coworkers and team? Even if you’re not in ops, there’s quite a bit you can do, it turns out. This article takes the form of lots of feedback via Twitter on all the different ways you can make the on-call experience better for you and your colleagues.
Graphite 1.1: Teaching an Old Dog New Tricks
Graphite 1.1 is out and it’s got some very welcome new features, chief among them: tagging support. There’s also metric piping, Python 3 support, and much more. Definitely check this out if you’re using Graphite in your infrastructure.
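If you haven't seen tagged metrics yet, here's a rough sketch of what one looks like on the wire. It assumes Graphite's tagged series naming (`name;tag1=value1;tag2=value2`) sent over the usual plaintext protocol (`path value timestamp`). The helper name `format_tagged_metric` and the sample metric and tags are made up for illustration, not taken from the release notes.

```python
import time


def format_tagged_metric(name, tags, value, timestamp=None):
    """Build one plaintext-protocol line for a tagged series.

    Hypothetical helper: tags are appended to the metric name as
    ;key=value pairs, sorted by key so the output is deterministic.
    """
    if timestamp is None:
        timestamp = int(time.time())
    tag_part = "".join(f";{key}={val}" for key, val in sorted(tags.items()))
    return f"{name}{tag_part} {value} {timestamp}\n"


# Example with made-up values; the resulting line could be written to a
# Carbon plaintext listener (port 2003 by default).
line = format_tagged_metric(
    "disk.used",
    {"server": "web01", "datacenter": "dc1"},
    42,
    timestamp=1500000000,
)
print(line, end="")
```

The point of tags is that `server` and `datacenter` become queryable dimensions instead of positions baked into a dotted path.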
Mine all the data, they said. It will be worth your while, they said
I had someone remark to me a while back, “We’ve built this magnificent haystack of metrics and logs and now we need to find the needle in it, except we aren’t even sure the needle is even there.” That stuck with me and underscored a major pitfall in the “instrument everything!” approach. This article talks quite a bit more about it. My favorite bit: CERN generates 1PB of data every second and throws away 0.999975 of all data collected. That’s wild.
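To put that CERN number in perspective, here's a quick back-of-the-envelope calculation. It assumes decimal units (1PB = 10^15 bytes) and takes the two figures above at face value. The function name is just for illustration.

```python
from decimal import Decimal

# Assumption: decimal (SI) units, so 1 PB = 10**15 bytes and 1 GB = 10**9 bytes.
PETABYTE = Decimal(10) ** 15
GIGABYTE = Decimal(10) ** 9


def retained_bytes_per_second(generated, discarded_fraction):
    """Return how many bytes per second survive after discarding a fraction."""
    return generated * (Decimal(1) - discarded_fraction)


# Figures quoted in the blurb above: 1 PB/s generated, 0.999975 discarded.
kept = retained_bytes_per_second(PETABYTE, Decimal("0.999975"))
kept_gb = kept / GIGABYTE
print(f"Kept: roughly {kept_gb:.0f} GB per second")
```

In other words, even after throwing away nearly everything, what's left is still on the order of tens of gigabytes every second.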
See you next week!
– Mike (@mike_julian) Monitoring Weekly Editor