Everyone’s favorite book, the Google Site Reliability Engineering book, now has a companion book: The Site Reliability Workbook. This new book aims to be the practical application of the original book, which was a whole lot of theory. Looking at the table of contents, there’s a lot of great stuff about monitoring, incident management, and more. The PDF is free for download until August 23rd, 2018, or you can preorder the physical copy at Amazon.
Speaking of awesome books, I’ve been looking forward to this for a while: Brian Brazil of Prometheus fame has announced that his O’Reilly book on Prometheus is out and shipping soon. What’s really impressive (to me, anyways) is how quickly it came to fruition. It took me 19 months, start-to-finish, for Practical Monitoring, and only six months for Brian. 🙌
The folks at Honeycomb have written up a neat little guide about observability in practical terms. It’s behind an opt-in wall, but I recommend grabbing it anyways–it’s a great read.
With all this talk about “high cardinality” around monitoring lately, this article finally explains what it really means, with concrete examples.
Well, this is neat: some folks are trying to come up with a specification for describing, defining, measuring, and reporting on SLAs for public APIs. You can read the spec at the ISA Group’s GitHub repo (isa-group/SLA4OAI-Specification).
Continuing this series on Kubernetes metrics, the author now hits on etcd. If you’re joining for the first time or want a refresher, here’s the start of the series.
I don’t know why it’s never occurred to me to use a Sankey diagram in performance monitoring/analysis, but it’s a genius idea. This article does a bit of traffic analysis and visualization using them in ELK; it’s light on details, but it should be plenty to get you thinking about applicability in your own environment.
This project aims to be a cross-platform interactive metrics system (think: top, systat, etc.) and it’s pretty nifty. It runs on Linux, BSD, OSX, and Windows, has a built-in API and web UI, and can push metrics to CSV, Influx, statsd, and a whole bunch of others. Super neat project.
Going to be in Europe this fall? Come hang out with me in Nuremberg this November and hear me prattle on about all the terrible cars I’ve driven over the years, and what they taught me about monitoring.
Speaking of which, I’ll also be in Kansas City for DevOpsDays. I’m speaking about monitoring, yes, but I’ll mostly be stuffing my face with barbecue and starting fights about how Tennessee barbecue is clearly better (Go Vols!).
Sensu has graciously offered a discount code for all Monitoring Weekly readers! Use MonitoringWeekly at checkout for $50 off the early bird ticket.
Are you in Europe? Everyone’s favorite monitoring conference, Monitorama, is coming to a canal near you! Monitorama kindly sent over a discount code for Monitoring Weekly readers: use MLOVEWEEKLY at checkout for €100 off.
See you next week!
— Mike (@mike_julian) Monitoring Weekly Editor