Hope you’re all doing well as we begin (or continue into) the seasonal holidays. Plenty of great articles this week covering monitoring and incident response topics. Enjoy! ⛄💗🐧

This issue is sponsored by:

Are you looking to modernize Log Analytics while controlling the cost?

DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. Get complete visibility into your entire stack and experience the DataSet difference for free.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Monitoring a Multi-Node Docker Swarm Stack with Grafana, Prometheus, and 100 Lines of Python

A helpful pattern (and free script) for monitoring a Docker Swarm stack with Prometheus.

Observing Hachyderms

Another Mastodon-related post this week, this time with a deeper dive on Hachyderm.io observability as they continue to deal with the influx of new users and scaling challenges.

How to Monitor the Kubelet

An excellent guide for understanding the Kubelet’s role and how to monitor it effectively.

The Incident Retrospective Ground Rules

Lex Neva (a former peer and someone I respect greatly) shares some important considerations for any healthy Incident Retrospective process.

LeakProf: Featherlight In-Production Goroutine Leak Detection

Go is a hugely popular programming language, but it’s also susceptible to a variety of goroutine memory leaks. This post from Uber details their in-house leak indicator and demonstrates how they use it to help diagnose a variety of leak types.

Indexing logs from AWS S3 to Elasticsearch

If you’re looking to index your AWS ALB logs from S3 to Elasticsearch, this engineer presents a solution which they claim scales better than Logstash and avoids duplicate records.

How do you stack up in terms of monitoring maturity?

Take the Loop1 L1M3 calculator and gain insight into your current IT Operations maturity level and the steps you should take to shift the needle toward insightful, where IT helps to drive business outcomes via effective analytics and business intelligence. (SPONSORED)

Adding Custom Metadata While Sending Logs with Filebeat

It can be super helpful to add custom tags to your logs. Unfortunately there are some edge cases where the typical processors and scripts fall short. This article provides an additional pattern for dealing with some of these scenarios.

How metrics collection agents protect against data loss when working with the remote write protocol

This post looks at durability concerns affecting the Prometheus remote write protocol, specifically at how a few different agents attempt to mitigate them.

GCP Cloud Asset Inventory Feed : Get real time notifications on Resource Changes

Although cost monitoring is not typically something we cover here, it’s an increasingly important consideration as companies look to minimize their exposure and tighten the purse strings in a rough economy. This post walks through the steps to set up alerts on GCP asset changes and route the notifications to your Slack team.

Monitorama PDX 2023 - June 26-28 (Portland, OR)

Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.

Job Opportunities

Senior Site Reliability Engineer at Beyond (NA Remote)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor