This week brings us a lot of variety, from controlling metrics costs to logging challenges and plenty more. If you use Alertmanager in production, make sure to check out the example for integrating it with n8n. Enjoy! 😍☕🏂
This issue is sponsored by:
Structure your process with a single source of truth and configurable step-by-step Runbooks. Automate declaration, assembly, and communication to move faster and more uniformly. Improve your systems with insights from incident analytics for true reliability gains. Get started for free or book a demo at www.firehydrant.com.
Articles & News on monitoring.love
It’s an awesome accomplishment to make it to Issue number 200 and I couldn’t have done it without all of you. I hope you’ll join us in the community Slack and share what you’ve been working on.
From The Community
Another excellent technical post from the Apache SkyWalking project, this time with a look at layer 7 observability via eBPF.
This article covers a lot of relevant topics and considerations for effective incident response. Although it’s not super detailed on any specific area, it’s a great jumping-off point for any engineer curious about the first steps towards better incident management.
This story from Pinterest engineers is the kind of thing that sounds reasonable but in reality is really easy to do poorly. OTOH if you’re only concerned about the frequency of a specific event, I think there are easier ways of accomplishing this (e.g. Vector).
Are you a Loki power user? You might be after this collection of tips from Grafana Labs.
This engineer walks us through their experience setting up monitoring for their MySQL server with Prometheus, including the metrics that matter to them for reliability and performance.
An interesting look at automating incident response actions for Alertmanager using the n8n project. I love the flexibility this offers, particularly since n8n can be self-hosted.
As we enter 2023, the cloud native revolution is taking hold. Tune in for a webinar where we discuss findings from the 2023 Cloud Native Observability Report that surveyed engineers and software developers on ways cloud native complexity makes their jobs harder and the hours longer. Register now. (SPONSORED)
Given the realities facing tech employers these days, the cost of monitoring data feels painfully relevant. This post covers some of the most likely culprits for metrics invoice bloat.
A look at Criteo’s internal observability practices now and their plans for the future.
If you manage any Kafka clusters or work with someone who does, check out this project that exposes your Kafka SLO as a Prometheus exporter.
Micrometer looks like a powerful and flexible library for monitoring your Java apps, with the advantage of being vendor agnostic. This is a great example of performance monitoring your services with distributions on a service like Datadog.
“Caretta is a lightweight, standalone tool that instantly creates a visual network map of the services running in your cluster.”
“An application metrics facade for the most popular monitoring tools. Think SLF4J, but for metrics.”
“Kafka monitoring is a service written in golang who will help you to monitoring and calculate your SLO over your kafka cluster.”
There are only a few weeks left to submit talks for Monitorama PDX 2023. Get your proposal in before the Feb 3 deadline! 🤩
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor