Issue 025

Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe you've got an idea rolling around in your head? Send it on over and let the community learn from you. :D

Monitoring News, Articles, and Blog posts
Monitoring and Observability

Lots of hubbub about “monitoring vs observability” lately! I really like this post for its in-depth treatment of what it means to build “observable” systems and how that's different from plain old monitoring.

A Field Guide to Observability (video)

A different take on the same topic of observability, from a talk given at DevOps Days MSP recently. While not as much of a deep dive as the article above, it’s a really great introduction to the topic.

Announcing Monitorama PDX 2018

It’s that time again folks: Monitorama has been announced! Sadly, all 100 of the early bird tickets sold out within hours. I recommend paying attention to the website for the opening of General Admission tickets and buying as soon as they’re up–they’ll go quickly!

3 lessons learned from an Elasticsearch game day

Datadog gives us a look behind the curtain at how they run a piece of their own infrastructure. In this post, they walk us through a recent game day exercise they performed on an Elasticsearch cluster and the lessons they learned from it. Not familiar with game day exercises? This post by John Allspaw in ACM is a great starting point.

wisq/diefenbaker: Scripts for reporting useful server metrics to Datadog

A small, newly released collection of scripts for reading data from various sources and sending it to Datadog.

PromCon 2017: Conference Recap (videos)

The video recordings from PromCon (that is, the Prometheus conference) are up. I haven’t had the chance to go through all of them yet, but the ones I’ve watched are pretty good so far. If you weren’t able to make it to the conference, you should check these out.

LogDevice: a distributed data store for logs

The folks at Facebook Engineering open-sourced a new project: LogDevice. It’s a distributed data store for logs running on top of RocksDB. The whole thing looks really interesting. I’m not sure how much work it would take (or how useful it would be) to integrate it into an environment that isn’t Facebook, but at the very least, you might learn a thing or two from the design.

oklog/oklog: A distributed and coordination-free log management system

ELK too complex for your needs? Grepping syslog flat files not flexible enough? oklog aims to fill that middle ground in tooling. It’s a pretty new project, but it looks like it could be neat for a lot of use cases.

Brian L. Troutwine on Twitter: “statsd is a bad protocol.”

I like statsd, I really do–it revolutionized the world of monitoring when it was introduced back in 2011 and continues to change how teams interact with their apps even today. But, as Brian points out, it does have some warts and growing pains. Maybe it’s time for the next iteration/spiritual successor?

See you next week!

– Mike (@mike_julian) Monitoring Weekly editor