Issue 133

I love this week’s collection, particularly the production use cases and debugging stories. Oh, and a bunch of recorded content from eBPF Summit 2021. Enjoy!

This issue is sponsored by:

Manage incidents directly from Slack

Rootly helps automate the tedious manual work like creating incident channels, searching for runbooks, documenting the postmortem timeline, and more. Teams sized 20 to 2000 manage hundreds of incidents daily and save thousands of engineering hours a year with Rootly. Get started in <5min or book a demo to learn more and get Starbucks ☕ on us!

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Computers are the easy part

Fantastic article from Mailchimp’s developer blog on a recent outage and how their incident response played out.

Using SLOs to Pursue User Happiness

Great article from Betterment’s engineering team. Besides a lot of really good coverage on SLOs, the framing of the topic as it pertains to customers and users makes my heart sing.

Prometheus Blackbox: What? Why? How?

I’m not a fan of the name, but the Prometheus Blackbox exporter is here to stay. This might be the most exhaustive article I’ve read on why you might need it and how to leverage it effectively.

Unpacking Observability: The Observability Stack

If you’re considering outsourcing your in-house observability stack, you’re not alone. The author makes a compelling argument as they evaluate a handful of vendors for their observability needs.

Securing Prometheus Scrapes with the Kuma Service Mesh

Having worked for a company that had to TLS All The Things™ (including Prometheus scrapes), the notion of having a secure service mesh handle all of that for me is a compelling alternative.

It’s always DNS. A kubectl story.

Monitoring and observability are nothing without solid investigative and debugging skills. And you know I love a good debugging story.

Honeycomb Is All-In on OpenTelemetry

Glad to see Honeycomb and others investing heavily in open instrumentation standards. A very thorough look at how this benefits Honeycomb customers, but also a great read if you’re interested in the OpenTelemetry ecosystem.

Time series forecasting for Prometheus & Grafana with BigQuery ML

I haven’t gotten to try this out myself, but it looks like it could surface some interesting data. Anyone familiar with any realistic use cases?

Connecting Monika with Prometheus

Monika is a new (to me) synthetic monitoring tool for your websites and web applications. This article walks through the basic usage of Monika and how it can export its results into Prometheus and Grafana.

Guiding Observers Through Prometheus’ Architecture

An overview of a fairly typical Prometheus infrastructure and related services.

How to Detect Security Threats in Your Systems’ Linux Processes

This is probably familiar territory for your average DevOps / SRE / Systems Engineer, but it’s never a bad time for a security refresher.

Loki 2.3 is out

If you’re already using Loki, version 2.3.0 sounds like a worthy upgrade. Oh, and this is the first Loki release under the AGPLv3 license.

Tools

hyperjumptech/monika

“Monika is a command line application for synthetic monitoring. The name Monika stands for ‘Monitoring Berkala’, which means ‘periodic monitoring’ in the Indonesian language.”

Events

Call for Presentations - linux.conf.au 2022 Systems Administration Miniconf

Linux.conf.au has opened the CFP for next year’s online Systems Administration Miniconf. The early submission deadline is this coming week (2021-08-25).

eBPF & Cilium Community

Talks from eBPF Summit 2021 are available online for your weekend binging.

Job Opportunities

Senior Platform Security Engineer at Clear (US, NYC)

Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture, or your money back. (SPONSORED)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor