Issue 173

This week’s recurring theme was incident management and how to do it sustainably, along with a number of articles on Prometheus and open source tooling. Now if you’ll excuse me, I’m going to try that OpenTelemetry demo and see about finally instrumenting my apps for tracing. ⏰📈🍿

This issue is sponsored by:

The Plug-and-Debug Serverless Observability Platform

Trouble locating bugs in your serverless environment? Quit wasting precious development time and get an end-to-end map of your services in just four minutes with 1-click distributed tracing. Navigate your serverless chaos seamlessly—with Lumigo.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

On monitoring from a (slightly) different point of view

A refreshing look at monitoring tooling from the perspective of an application engineer.

Monitor your applications on Google Managed Prometheus

If you’re considering Google’s managed Prometheus service, you owe it to yourself to check out this very thorough tutorial.

Incident Management Guide

There are a lot of companies out there that don’t have a comprehensive policy for incident response. I paged (pun intended) through the first few chapters of this guide and came away really impressed. Props to the author(s) for keeping the vendor pitches to a bare minimum.

Modern tools for a Modern Workforce

Remember when we used to manage and name (gasp) our servers? This post feels like a bit of a throwback to the days of yore, walking the reader through some basic Linux administration and troubleshooting tasks before a very detailed look at setting up Prometheus and the various exporters you’ll need for any modern Linux system.

Observability Mythbusters: How hard is it to get started with OpenTelemetry

This might be the most approachable way of learning OpenTelemetry that I’ve seen to date. I like that they use the OpenTelemetry Community Demo Application as the demo service for this example.

Monitorama 2022 - OpenTelemetry and so much more!

Leon might be the only person looking forward to Monitorama PDX 2022 more than me (although I’d wager I’m still more anxious). Still, I appreciate that so many folks are looking forward to it and that he thought to write this sneak peek at what we can expect from the event.

Chronosphere logo

Now more than ever, we need monitoring and observability built for the cloud native world. The new O’Reilly Media report addresses practical challenges and solutions for modern architecture, highlighting the roles of observability and metrics, how to harness growing metric data, and the nuts and bolts of great metrics functions. Download your copy today! (SPONSORED)

Grafana 9.0 announcement

Last week was a big week for Grafana Labs, announcing their latest Grafana 9.0 release at GrafanaCONline (that’s a lot of Grafana). I’m cautiously optimistic about what they’re trying to do with the new visual query builder for Prometheus (having a bit of experience working with time-series UIs myself). Still, I’m anxious to see how they evolve this functionality going forward.

Tracking On-Call Health

A look at how Honeycomb tracks their own internal on-call operational health, and the steps they’re taking to improve it.

Is It Time For You to Adopt Managed Monitoring?

Another look at the classic build-versus-buy decision for observability and monitoring resources.

Setup Prometheus, Kube State metrics and Integrate Grafana with Kubernetes

A detailed two-part series on using Prometheus with kube-state-metrics (KSM) to monitor your Kubernetes cluster.

Introducing Grafana OnCall OSS, on-call management for the open source community

In another announcement from Grafana last week, they released an open source, self-hosted version of their on-call project, which originally launched in Grafana Cloud earlier this year.

Tools

grafana/oncall

“Developer-friendly incident response with Slack integration.”

kubernetes/kube-state-metrics

“kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects.”

Events

Monitorama PDX 2022 - June 27-29 (Portland, OR)

I probably don’t need to remind you, but Monitorama is coming up in just over a week. I’m going to be there with the entire family and I can’t wait to see a bunch of friendly (masked) faces. We have a fantastic lineup of speakers and some fun activities planned. There are still a few dozen tickets remaining if you’re in the area and would like to join us.

Job Opportunities

Infrastructure Engineer at CompanyCam (NA Remote)

Senior Software Engineer - SRE at Barracuda (US Remote)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor