Hope you’re all doing well as we begin (or continue into) the seasonal holidays. Plenty of great articles this week covering monitoring and incident response topics. Enjoy! ⛄💗🐧
This issue is sponsored by:
Are you looking to modernize Log Analytics while controlling the cost?
DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. Get complete visibility into your entire stack and experience the DataSet difference for free.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
A helpful pattern (and free script) for monitoring a Docker Swarm stack with Prometheus.
Another Mastodon-related post this week, this time with a deeper dive on Hachyderm.io observability as they continue to deal with the influx of new users and scaling challenges.
An excellent guide for understanding the Kubelet’s role and how to monitor it effectively.
Lex Neva (a former peer and someone I respect greatly) shares some important considerations for any healthy Incident Retrospective process.
Go is a hugely popular programming language, but it’s also susceptible to a variety of goroutine memory leaks. This post from Uber details their in-house leak indicator and demonstrates how they use it to help diagnose a variety of leak types.
If you’re looking to index your AWS ALB logs from S3 to Elasticsearch, this engineer presents a solution which they claim scales better than Logstash and avoids duplicate records.
How do you stack up in terms of monitoring maturity?
Take the Loop1 L1M3 calculator and gain insight into your current IT Operations maturity level and the steps you should take to shift the needle toward insightful, where IT helps to drive business outcomes via effective analytics and business intelligence. (SPONSORED)
It can be super helpful to add custom tags to your logs. Unfortunately, there are some edge cases where the typical processors and scripts fall short. This article provides an additional pattern for dealing with some of these scenarios.
This post looks at durability concerns affecting the Prometheus remote write protocol, specifically at how a few different agents attempt to mitigate them.
Although cost monitoring is not typically something we cover here, it’s an increasingly important consideration as companies look to minimize their exposure and tighten the purse strings in a rough economy. This post walks through the steps to alert on GCP assets and route the notifications to your Slack team.
Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor