Issue 008

Thanks for joining us for another issue of Monitoring Weekly!

Monitoring News, Articles, and Blog posts
Logs and Metrics

A wonderfully deep look at when you might want a metric versus when you might want a log, the role of unit tests versus monitoring, structured versus unstructured logging, whitebox versus blackbox metrics, and how all of this fits nicely under the umbrella of “observability.”

Circuit breaker and monitoring of a gRPC service in Ruby (Part 1)

One of the core requirements for stable and scalable microservices is good monitoring and reliability. Circuit breakers are a common tactic for increasing the stability of microservices, and this article looks at how you might monitor such behavior with statsd and the gRPC framework.
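To make the idea concrete, here is a minimal Python sketch of a circuit breaker that reports its state changes as statsd-style counters. It is not the article's Ruby/gRPC code: the class, the metric names (`circuit.failure`, `circuit.opened`, and so on), and the `emit` callback are all hypothetical, chosen for illustration. The only thing borrowed from statsd is the `name:value|c` counter line format.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker that emits one statsd-format counter per event."""

    def __init__(self, emit, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.emit = emit                  # callable taking one metric line (assumption)
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Still open: fail fast without touching the backend.
                self.emit("circuit.rejected:1|c")
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            self.emit("circuit.failure:1|c")
            # Open on reaching the threshold, or re-open if the trial call failed.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
                self.emit("circuit.opened:1|c")
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        self.emit("circuit.success:1|c")
        return result
```

In this sketch `emit` can be anything that accepts a string, such as `list.append` while testing or a small function that sends the line over UDP to a statsd daemon in a real service.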

Introducing distributed tracing in your Python application via Zipkin

Distributed tracing has become a hot topic in the past year, but sadly, there aren’t a lot of people talking about how they’re actually using it. This article gives a short demo of using Zipkin, an open-source distributed tracing tool, to instrument a Python app.

Take OpenTracing for a HotROD ride

Another demo app using distributed tracing, this time with OpenTracing and a much more extensive application. This is a big article with a lot of really great stuff, so grab your coffee and hang on.

Sensor Monitoring – Providing Clean Energy in Africa

I think this serves as a periodic reminder that the things we do aren’t just used for behind-the-scenes, boring, commercial use-cases, but often have world-altering and life-changing uses too. I’d really love to see some more of the technical implementation details behind this one.

CPU Utilization is Wrong

Think you know what %CPU in ‘top’ means? Prepare for a new perspective. Seems we all may have been tuning for the wrong constraint all along.

PagerDuty and Atlassian Collaborate for Faster Incident Resolution Times

PagerDuty has been on a roll these past few months with their emphasis on promoting and improving incident management practices for the community, and this latest improvement is (to me, at least) a long time coming: a first-party JIRA-PagerDuty integration. Gone are the days of hacky scripts to bridge the two.

Announcing the Modern Incident Resolution Lifecycle

Speaking of which, yet more really interesting and useful features from PagerDuty for tackling your incident management process improvements.

Monitoring Push vs Pull – InfluxData supports both

Ah, the age-old “push vs pull” monitoring argument. The folks at Influx have recognized that it’s not a clear-cut answer, and in response, have added some useful pull-based monitoring capabilities to Kapacitor by integrating some code found in the Prometheus project.
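For anyone new to the argument, the difference comes down to who initiates the transfer. Here is a toy, in-memory Python sketch of both models. It has nothing to do with how Kapacitor or Prometheus are implemented; the `Collector` class and its method names are hypothetical.

```python
class Collector:
    """Hypothetical collector that accepts pushed samples and also scrapes targets."""

    def __init__(self):
        self.samples = {}    # latest value seen per metric name
        self.targets = []    # callables that return {name: value} when scraped

    def receive(self, name, value):
        # Push model: the monitored service calls in whenever it has data.
        self.samples[name] = value

    def register(self, target):
        # Pull model: the collector only needs to know where to ask.
        self.targets.append(target)

    def scrape(self):
        # Pull model: the collector decides when to fetch current values.
        for target in self.targets:
            for name, value in target().items():
                self.samples[name] = value


collector = Collector()

# Push: the application sends a sample on its own schedule.
collector.receive("queue.depth", 12)

# Pull: the application just exposes its current state.
collector.register(lambda: {"requests.total": 340})
collector.scrape()
```

With push, the service has to know where the collector lives; with pull, the collector has to know where every service lives. That trade-off is a large part of why neither side wins the argument outright.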

The PMCs of EC2: Measuring IPC

Performance Monitoring Counters (PMCs) are now available from AWS EC2 instances (dedicated instances only). Truth be told, I had no idea what these were, having not done a lot of work on performance analysis and tuning at such a low level. Even if you haven’t either, this is still a really interesting read.
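If the acronym in the title is new to you as well: IPC is instructions per cycle, the number of instructions retired divided by the number of CPU cycles elapsed over the same interval. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the counter values are made up for illustration, not taken from the article.

```python
def instructions_per_cycle(instructions, cycles):
    """Return IPC given two counter readings taken over the same interval."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


# Made-up counter deltas for a one-second sample (illustrative only).
sample_ipc = instructions_per_cycle(instructions=1_500_000, cycles=1_000_000)
```

On a real machine the two inputs would come from the hardware counters the article describes rather than from hard-coded numbers.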

Are Algorithms Better Than Humans?

A caution against over-using algorithms in data-driven decision-making. It’s a little bit meta for our purposes, but given that a lot of monitoring tools are headed in the direction of automated anomaly detection using machine learning approaches, I think the argument made here is well taken.

Tools
Flow | A personal dashboard to focus on what matters

For something more on the fun side, this tool allows you to monitor…yourself! Best of all, it’s open-source and can be self-hosted.

Elastic Stack 5.4.0 released

A whole bunch of new features and bugfixes from the folks at Elastic.

Elastic Stack 6.0.0-alpha1 Released

If you like staying on the cutting edge, Elastic just dropped the 6.0 Alpha release. One of the more interesting bits is the capability to now upgrade from one major release to the next (5.x -> 6.x) without bringing the entire ES cluster down. That’s slick.

consul2dogstats, with Dimensional Tagging

A new tool to collect service health data from Consul and publish to Datadog.

Events & Meetups

(Do you have a monitoring-related meetup/event you want to announce here? Just email me!)

May ‘17 SF Metrics Meetup - San Francisco Metrics Meetup (San Francisco, CA)

If you’re in San Francisco, be sure to drop by this month’s SF Metrics Meetup!

Thanks for subscribing to Monitoring Weekly, folks! If you like what you’ve seen, invite your friends and colleagues! As always, if you have interesting articles, news, events, or tools to share, send them our way by replying to this email.

See you next week!

– Mike (@mike_julian) Monitoring Weekly editor