I ran across this book from 2014 last week. It’s a short read (50 pages) but pretty great. It’s less about “and here’s why ELK is great” and more conceptual, e.g., “What is a log? Why is it important?” and how logs are (and should be) used in a distributed services environment. If you like distributed systems, Kafka, and architectural challenges, this is a good book to read. I read the physical copy (linked above), but there’s an ebook available here.
Breaking large problems into smaller ones is a tried-and-true way to solve them and find insight, and this article on logging does just that. It observes that logging is really five separate problems.
Just your basic overview of CloudWatch terminology and functionality. If you’re not that familiar with CloudWatch and you’re also stuck in the never-ending maze known as ‘The Amazon Documentation’, read this.
We borrow a lot from other fields in the course of trying to understand our own (incredibly young) field. This article makes comparisons to the medical field to discuss health, availability, and more.
Well, this is a new one: replacing Whisper with ClickHouse. Pretty much everyone I know just replaces the whole thing with Influx or something, so this is definitely worth reading. The authors do note a couple of deficiencies in the new stack, such as the lack of tag support in ClickHouse.
Maintainers from Graphite, Prometheus, InfluxDB, and TimescaleDB share a stage and talk about time series databases. I’m waiting for the cage match where rrdtool inexplicably wins.
This is an incredible talk by the author of HAProxy on understanding what’s going on with your systems from the perspective of HAProxy. I knew HAProxy exposed a lot of data, but I hadn’t realized exactly how much work went into it.
PromCon is fast approaching. Got your ticket yet?
See you next week!
— Mike (@mike_julian)
Monitoring Weekly Editor