Issue 093

Did you know I have a podcast too? Check it out: Real World Devops

This issue is sponsored by:

InfluxData: Using GraphQL with InfluxDB and Flux

I am rather late to the GraphQL party, personally, so this article is doubly helpful. Combining GraphQL and Flux yields some really powerful and interesting possibilities.

From The Community

RUM vs. APM: How They’re Similar and Different

This is an interesting take on things: RUM is more of a technique or way of monitoring a specific thing, whereas APM is a much broader category that encompasses RUM.

A CIO’s take on preparing for SaaS outages

Coming off the recent Zoom outage, this article makes some timely suggestions, including one I’ve been harping on for a while: monitor your external dependencies just like your internal systems.

Fearless shared postmortems

A lot of what we read about postmortems is written for an internal audience. That’s great. This article talks about postmortems for an external audience.

Hey, you busy? I have thousands of questions to ask you.

The folks at Curalate wrote up how they went about preparing for Black Friday/Cyber Monday traffic. I love their approach of consulting historical data and then generating load to match those levels. Simple and very effective.

Lambda Logs in ELK

It’s really just Lambdas all the way down, except this time, it’s resting on the back of a big ass elk.

Automated Internet Speedtests for Distributed Networks

I’m glad to know the folks at Chick-Fil-A take their job seriously enough to implement automated network speed testing at all locations. I shudder to think about the possibility that I might not get my nuggets. <3

When Does an Investigation End?

I’ll let you read the article yourself, but my favorite line: “As with all complex topics, the answer to ‘When does an investigation end?’ is ‘it depends.’ And even that may not be valid.”

On Infrastructure at Scale: A Cascading Failure of Distributed Systems

I love these sorts of stories about “shit went sideways, we fixed it, and here’s what we learned.”

Managing reliability with SLOs and Error Budgets

The folks at Kudos teach us about SLOs, SLIs, and error budgets, as well as talk through their own implementation of them.
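
If error budgets are new to you, the arithmetic is simpler than the name suggests. Here's a minimal sketch, assuming a request-based SLO; the function name, fields, and numbers are made up for illustration and are not taken from the Kudos article.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error budget usage for a request-based SLO.

    slo_target is the fraction of requests that should succeed, e.g. 0.999.
    All names here are hypothetical, chosen only for this example.
    """
    # The budget is the number of requests allowed to fail in the window.
    allowed_failures = round((1 - slo_target) * total_requests)
    remaining = allowed_failures - failed_requests
    consumed = failed_requests / allowed_failures if allowed_failures else float("inf")
    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": remaining,
        "budget_consumed": consumed,
        "budget_exhausted": remaining < 0,
    }


# A 99.9% SLO over 1,000,000 requests, with 250 failures so far.
report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(report)
```

With those made-up numbers, the budget allows 1,000 failed requests, so 250 failures means a quarter of the budget is spent.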

tersesystems/terse-logback: Structured Logging Example with Logback

For the Java folks in the audience: Logback, to quote the website, “is intended as a successor to the popular log4j project, picking up where log4j leaves off.” This repo is basically an in-depth example of how to use it.

Winston: A Better Way To Log

And while we’re at it, here’s one for the JavaScript folks: “Winston is a JavaScript logging library that makes logging to various, persistent, storage locations, like databases or files, much simpler.”

This issue is sponsored by:

GitPrime: 📈 Data-Driven Guide to Engineering Leadership

Ship faster because you know more, not because you’re rushing. Get actionable insights from 7 million commits and 85,000+ software engineers, to increase your team’s velocity. Free Guide

Events

GrafanaCon Los Angeles, CA - February 25-26, 2019

The folks at Grafana Labs <3 all you Monitoring Weekly readers too, so they’ve offered an exclusive discount code to the event. Use code MLOVEWEEKLY-19 at checkout for $100 off.

Jobs

Want your job listed here? Why not submit a post to the job board? It’s only $99/ad for 30 days.

See you next week!

– Mike (@mike_julian) Monitoring Weekly Editor