This issue is sponsored by:

Research Report: How Can You Tell If Your DNS Is Completely Secure?

81% of the world’s busiest domains are open to outages because of poor DNS setup. See how yours compares across several DNS security measures, learn how you might be in jeopardy, and find out the three easy fixes to make. Get Panopta’s research report [The Perilous State of Global Web Domains] in 2019 today.

Articles & News on

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

Understanding Observability (and Monitoring) with Christine Yen

Monitoring and observability are topics near and dear to my heart, so I was excited to talk to Christine Yen, Cofounder & CEO of Honeycomb, about observability, why dashboards aren’t as helpful as you think, and the value of being able to ask questions of your own application and infrastructure when you’re troubleshooting.

From The Community

Cloudflare architecture and how BPF eats the world

eBPF at Cloudflare, used in a network-heavy, CDN context. Very cool stuff.

Python Logging: A Stroll Through the Source Code

I love this article because it’s actually a look at the internals of Python’s logging module. I’ve been writing Python for years and I learned new stuff about this module pretty much right away. For example, did you know the LEVEL constants (e.g., INFO, ERROR, DEBUG) are actually just integers? Very neat.
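If you want to see the integer thing for yourself, here is a quick sketch using only the standard library's `logging` module. The `levels` dictionary is just a name made up for this illustration; it is not something from the article.

```python
import logging

# The level constants are plain integers, ordered by increasing severity.
levels = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}
print(levels)

# Because they are integers, they compare like integers.
print(logging.DEBUG < logging.INFO < logging.ERROR)

# getLevelName maps a numeric level back to its name.
print(logging.getLevelName(logging.INFO))
```

This is also why a logger set to `WARNING` drops `INFO` messages: as far as I can tell, the check boils down to a numeric comparison between the message's level and the logger's effective level.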

Monitoring everything…?

I won’t ruin the punchline for you, but you would be shocked how often this occurs even in multi-million-dollar companies.

When Good Certificates Go Bad: Monitoring for Expired TLS Certificates

There’s a great story behind this, culminating in a new piece of open source code that publishes certificate expiration data for apps running on k8s in Prometheus format.
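To give a rough idea of what "certificate expiration in Prometheus format" can look like, here is a minimal sketch using only Python's standard library. It is not the tool from the article: the metric name `tls_cert_not_after_timestamp_seconds`, the `host` label, and both function names are assumptions made up for this example.

```python
import socket
import ssl


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'.

    Note: the default context verifies the certificate, so an already
    expired certificate raises an SSL error instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def cert_expiry_metric(host: str, not_after: str) -> str:
    """Render one Prometheus text-format sample for a certificate's expiry time."""
    # cert_time_to_seconds converts the notAfter string to seconds since the epoch.
    expires_at = ssl.cert_time_to_seconds(not_after)
    # Hypothetical metric and label names, chosen for this sketch only.
    return f'tls_cert_not_after_timestamp_seconds{{host="{host}"}} {expires_at}'


# Offline example with a fixed date, so no network access is needed.
sample = cert_expiry_metric("example.com", "Jan  5 09:34:43 2018 GMT")
print(sample)
```

Exposing the raw expiry timestamp, rather than "days remaining", leaves the arithmetic to the alerting side, where a rule can compare it against the current time.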

Grafana Labs at KubeCon: The Latest on Cortex

For those curious about what’s been going on with Cortex, the multi-tenant Prometheus system, the folks at Grafana have some updates, including a plan for reaching 1.0 soon and making regular releases.

The Negotiability of “Severity” Levels

How many of you have the concept of a severity level for incidents? This article makes a really compelling case that there’s much more than one definition for the usage of severity levels. Sadly, there are no recommendations for what to do next; hopefully that’s a future article.

Practical Service Level Objectives With Error Budget

SLOs can be tricky to understand and implement properly. This talk can definitely help fix that.

Defining Observability For Robotics

This post has both observability AND robots. What more do you want?

Using Grafana to Monitor EMS Ambulance Service Operations

From the article: “The Emergency Services team at Trapeze Group provides 24/7/365 support for ambulances in Australia. Each fleet can contain as many as 1,000 vehicles, with more than 60 telemetry channels and 120 million messages going in and out to paramedics every day.”

This issue is sponsored by:

Raygun’s Continuous Delivery Process

The folks at Raygun have some thoughts on CI/CD, but rather than bore you with product news, they wrote an article about how they do CI/CD for Raygun itself. Check it out.


Monitorama Baltimore 2019 - October 21-22, 2019 - Baltimore, MD USA

Yes, you read that right: Monitorama is doing a new event on the American east coast! I’m super excited.

REdeploy is REturning! - October 16-17, 2019 - San Francisco, CA USA

Yes! REdeploy is one of the great conferences of the last few years and it’s coming back to San Francisco for round 2. The CFP is open and early bird tickets are for sale.

Observability Meetup - May 30, 2019 - Boston, MA USA

London Monitoring Summer Meetup - June 12, 2019 - London, UK

See you next week!

– Mike (@mike_julian) Monitoring Weekly Editor