A very eclectic mix of articles this week, including some topical discussions on holiday surge preparation and on-call participation. Oh, and speaking of surges, we have another big wave of relevant job postings below. Enjoy!

This issue is sponsored by:

Work. Without the hard work.

LogicMonitor empowers teams to spend less time troubleshooting and more time innovating with fully automated infrastructure monitoring and log analysis. AI-powered intelligence automatically detects monitoring resources, surfaces anomalies, and provides root cause analysis across your entire stack. Leave the manual configuration, expensive hardware, and long hours of troubleshooting behind with a free trial of LogicMonitor.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

6 Steps SREs Should Take to Prepare for Black Friday and Cyber Monday 2021

Some last-minute tips for SREs leading up to the chaos of a busy holiday season. Frankly, these are achievable goals that should already be in your planning, but it’s better late than never.

I Don’t Want to Be On Call Anymore. Am I a Monster?

Being on-call is rarely fun, and many companies struggle to do it in a sustainable manner. But when is enough, enough? A tough-but-honest conversation about doing on-call better, or not at all.

Using OpenTelemetry auto-instrumentation/agents in Kubernetes

The OpenTelemetry Operator now supports auto-instrumentation in Kubernetes “when an Instrumentation CR is present in the cluster and a namespace or workload is annotated”. This sounds fantastic, and I’m eager to see support added for more languages soon (Java, Node.js, and Python are currently supported).
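
For anyone curious what “annotated” looks like in practice, here is a minimal sketch in Python that builds the pod-template annotation the Operator watches for. The annotation key follows the Operator's `instrumentation.opentelemetry.io/inject-<language>` convention; the helper name `injection_patch` and the `SUPPORTED` set are my own illustration, not part of the Operator, and this assumes an Instrumentation CR already exists in the cluster.

```python
import json

# Languages the blurb above lists as currently supported (assumption:
# this set is illustrative and may lag behind the Operator itself).
SUPPORTED = {"java", "nodejs", "python"}


def injection_patch(language: str, value: str = "true") -> dict:
    """Build a patch body that annotates a workload's pod template.

    `value` is "true" to use the Instrumentation CR in the same namespace,
    or the name of a specific Instrumentation CR.
    """
    if language not in SUPPORTED:
        raise ValueError(f"unsupported language: {language}")
    key = f"instrumentation.opentelemetry.io/inject-{language}"
    return {"spec": {"template": {"metadata": {"annotations": {key: value}}}}}


if __name__ == "__main__":
    # Print the patch as JSON so it can be handed to a tool such as kubectl.
    print(json.dumps(injection_patch("java"), indent=2))
```

The printed JSON could then be applied to a Deployment with something like `kubectl patch deployment <name> -p '<json>'`, though editing the annotation directly into the manifest works just as well.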

Monitor Kubernetes pod status from a Jenkins pipeline

This isn’t your typical monitoring article (it isn’t even really written as such), but it reminds me that there’s still a functional gap in how we monitor deployments with traditional Release Engineering tools. Admittedly, cardinality remains a problem for these types of short-lived jobs, but surely there’s a better way?

Docker Container Monitoring with Different Options — ZABBIX/ELK/Prometheus

An objective comparison of container monitoring with Zabbix, ELK, and Prometheus.

Enterprise monitoring using Amazon Managed Prometheus and Grafana

An unexpectedly thorough write-up of one company’s transition to Amazon-managed Prometheus and Grafana. The author does a great job explaining why they chose to follow this path, as well as the drawbacks you’ll encounter.

Smoking a Turkey with Prometheus, Home Assistant, and Grafana

A fun project for connecting a standard (non-IoT) grill to sensors and open source observability tools. This sort of thing is exactly why I’ve been nagging Camp Chef for access to the API for my networked pellet grill.

Improve your Core Web Vitals with this Definitive Guide

Pave the way to Core Web Vitals nirvana using this in-depth guide for developers. Learn actionable tips, best-practice advice, and a proven workflow to boost your scores, bolster your Google search ranking and enhance your end-user experience. Check out the Developer's Guide to Core Web Vitals today. (SPONSORED)

Nvidia GPU to Pod metrics via Grafana

I don’t know how many of you are running NVIDIA GPU clusters, but if you are, this article may prove useful for tracking utilization metrics.

Advanced Monika alert queries: A guide to operators and helpers

How to use Monika’s operators and helpers for more complex alerting possibilities.

Platform Monitoring — First concepts

This article covers a lot of familiar concepts, but it’s still a good read for folks who might otherwise be new to observability or monitoring.

Scrape RabbitMQ Metrics With Prometheus in Kubernetes

A quick example for monitoring RabbitMQ in Kubernetes without relying on additional external plugins or libraries.



Job Opportunities

DevOps / Site Reliability Engineer at Leadfeeder (Remote)

Backend Engineer - Observability Infrastructure at Spotify (US, NYC)

Senior Observability Tools Engineer at Timescale (Remote)

Senior Database/SQL Engineer, Observability at Timescale (Remote)

DevOps Engineer at Mandiant (Remote)

Negotiating your AWS contract? Let us help. At The Duckbill Group, we’re on your side and we see dozens of these a year, more than most AWS account managers! We’ve helped negotiate everything from $3mm contracts to $650mm contracts and a whole slew in between. Check out our AWS contract negotiation services. (SPONSORED)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor