This week brings us a lot of variety, from controlling metrics costs to logging challenges and plenty more. If you use Alertmanager in production, make sure to check out the example for integrating it with n8n. Enjoy! 😍☕🏂

This issue is sponsored by:

Firehydrant

Structure your process with a single source of truth and configurable step-by-step Runbooks. Automate declaration, assembly, and communication to move faster and more uniformly. Improve your systems with insights from incident analytics for true reliability gains. Get started for free or book a demo at

Articles & News on

Observability & Monitoring Community Slack

It’s an awesome accomplishment to make it to Issue number 200 and I couldn’t have done it without all of you. I hope you’ll join us in the community Slack and share what you’ve been working on.

From The Community

eBPF enhanced HTTP observability - L7 metrics and tracing

Another excellent technical post from the Apache SkyWalking project, this time with a look at layer 7 observability via eBPF.

Navigating the Storm: Strategies for Managing Production Incidents

This article covers a lot of relevant topics and considerations for effective incident response. Although it’s not super detailed on any specific area, it’s a great jumping-off point for any engineer curious about the first steps towards better incident management.

Build an end to end JSON logging system for clients apps

This story from Pinterest engineers is the kind of thing that sounds reasonable but in reality is really easy to do poorly. OTOH if you’re only concerned about the frequency of a specific event, I think there are easier ways of accomplishing this (e.g. Vector).

Watch: 5 tips for improving Grafana Loki query performance

Are you a Loki power user? You might be after this collection of tips from Grafana Labs.

How I’m using Grafana and Prometheus to Monitor MySQL Server

This engineer walks us through their experience setting up monitoring for their MySQL server with Prometheus, including the metrics that matter to them for reliability and performance.

Alertmanager incident response automation with n8n

An interesting look at automating incident response actions for Alertmanager using the n8n project. I love the flexibility this offers, particularly since n8n can be self-hosted.

Chronosphere

As we enter 2023, the cloud native revolution is taking hold. Tune in for a webinar where we discuss findings from the 2023 Cloud Native Observability Report that surveyed engineers and software developers on ways cloud native complexity makes their jobs harder and the hours longer. Register now. (SPONSORED)

Expensive Metrics: Why Your Monitoring Data and Bill Get Out Of Hand

Given the realities facing tech employers these days, the cost of monitoring data feels painfully relevant. This post covers some of the most likely culprits for metrics invoice bloat.

Monitoring microservices

A look at Criteo’s current internal observability practices and their plans for the future.

Controlling SLO for your Kafka Cluster

If you manage any Kafka clusters or work with someone who does, check out this project that exposes your Kafka SLOs through a Prometheus exporter.

Effectively measuring execution times with Micrometer & DataDog

Micrometer looks like a powerful and flexible library for monitoring your Java apps, with the advantage of being vendor-agnostic. This post is a great example of performance monitoring your services with distributions on a service like Datadog.



Caretta is a lightweight, standalone tool that instantly creates a visual network map of the services running in your cluster.


An application metrics facade for the most popular monitoring tools. Think SLF4J, but for metrics.


Kafka monitoring is a service written in Go that helps you monitor and calculate SLOs for your Kafka cluster.


CFP open for Monitorama PDX 2023

There are only a few weeks left to submit talks for this year’s Monitorama PDX event. Get your proposal in before the Feb 3 deadline! 🤩

Job Opportunities

Senior Software Engineer - GX Open Source at Great Expectations (US Remote)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor