This issue is sponsored by:
81% of the world’s busiest domains are open to outages because of poor DNS setup. See how yours compares across several DNS security measures, learn how you might be in jeopardy, and find the three easy fixes to make. Get Panopta’s research report [The Perilous State of Global Web Domains] in 2019 today.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
Monitoring and observability are near and dear to my own heart, so I was excited to talk to Christine Yen, Cofounder & CEO of Honeycomb, about observability, why dashboards aren’t as helpful as you think, and the value of being able to ask questions of your own application and infrastructure when you’re troubleshooting.
From The Community
eBPF at Cloudflare, used in a network-heavy, CDN context. Very cool stuff.
I love this article because it’s actually a look at the internals of Python’s logging module. I’ve been writing Python for years and I learned new stuff about this module pretty much right away. For example, did you know the LEVEL constants (e.g., INFO, ERROR, DEBUG) are actually just integers? Very neat.
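If you want to see the integer thing for yourself, here is a quick sketch using only the standard library. The `LEVELS` dict and the `is_enabled` helper are names I made up for illustration; `is_enabled` is a simplified stand-in for the threshold comparison, not the module's actual internals.

```python
import logging

# The level "constants" are plain module-level integers.
# LEVELS is a hypothetical name used only for this illustration.
LEVELS = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}


def is_enabled(record_level: int, threshold: int) -> bool:
    """Simplified, hypothetical version of a level check:
    a record passes when its level is at least the threshold."""
    return record_level >= threshold


if __name__ == "__main__":
    for name, value in LEVELS.items():
        # Each value is an ordinary int, so it can be compared and sorted.
        print(f"{name:<8} {value:>2} {type(value).__name__}")

    # Because levels are integers, ordering is just numeric comparison.
    print(is_enabled(logging.ERROR, logging.INFO))
    print(is_enabled(logging.DEBUG, logging.INFO))

    # getLevelName maps a registered integer back to its name.
    print(logging.getLevelName(logging.INFO))
```

Since the levels are just numbers, filtering by level comes down to a numeric comparison, which is also why you can define custom levels that sit between the built-in ones.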
I won’t ruin the punchline for you, but you would be shocked how often this occurs even in multi-million dollar companies.
There’s a great story behind this, all culminating in a new piece of open source code that publishes the certificate expiration of apps running on k8s in Prometheus format.
For those curious about what’s been going on with Cortex, the multi-tenant Prometheus system, the folks at Grafana have some updates, including a plan for reaching 1.0 soon and making regular releases.
How many of you have the concept of a severity level for incidents? This article makes a really compelling case that there’s much more than one definition of how severity levels get used. Sadly, there are no recommendations for what to do next. Hopefully that comes in a future article.
SLOs can be tricky to understand and implement properly. This talk can definitely help fix that.
This post has both observability AND robots. What more do you want?
From the article: “The Emergency Services team at Trapeze Group provides 24/7/365 support for ambulances in Australia. Each fleet can contain as many as 1,000 vehicles, with more than 60 telemetry channels and 120 million messages going in and out to paramedics every day.”
This issue is sponsored by:
The folks at Raygun have some thoughts on CI/CD, but rather than bore you with product news, they wrote an article about how they do CI/CD for Raygun itself. Check it out.
Yes, you read that right: Monitorama is doing a new event on the American east coast! I’m super excited.
Yes! REdeploy is one of the great conferences of the last few years and it’s coming back to San Francisco for round 2. The CFP is open and early bird tickets are for sale.
See you next week!
– Mike (@mike_julian) Monitoring Weekly Editor