An eccentric mashup of monitoring posts this week, with an emphasis on metrics design and collection. And an old man yells at the cloud. 😂📢⛅

This issue is sponsored by:

Chronosphere

Your on-call holiday survival kit is here.

In the spirit of the holidays, Chronosphere has packaged 4 presents to help Engineering teams march towards reducing stress and avoiding burnout. Put your best foot forward (in style) while moving towards on-call experiences that suck less. Get your kit!



Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Making a technical platform…

A unique look at the evolution (pun intended) of one company’s platform infrastructure, including Observability and related concerns.

Understanding Duplicate Samples and Out-of-order Timestamp Errors in Prometheus

This is a fascinating read on Prometheus out-of-order metrics, particularly if you’re a crufty old TSDB admin and former Graphite maintainer who argues this should have been supported(*) years ago. All teasing aside, it really is a very interesting post with plenty of relevant technical details and helpful bits for Prometheus admins.

* I acknowledge that all TSDB authors make compromises relevant to their respective requirements, but after having seen countless “new hot metrics engines” come and go, it feels inevitable to me that all competing TSDBs eventually settle on roughly the same feature set with the primary differences boiling down to implementation details and a select collection of bugs deemed too difficult to fix. Don’t @ me.

Phantom Metrics: Why Your Monitoring Dashboard May Be Lying to You

I’ve been guilty of “monitoring all the things” in the past, but we still hear the same question repeated year after year… “what should I be monitoring?” This post revisits numerous important considerations for metrics design and collection.

k8spacket - are your TLS connections inside the cluster still secure?

Monitoring for TLS versions and ciphers feels like a bit of an edge case, but I have no doubt there are security and compliance engineers in your org right now who would swoon over this.

How to monitor kube-controller-manager

I’ve genuinely enjoyed these monitoring deep-dives on Kubernetes components from Sysdig. Although much of this information is available in the official docs, it’s nice to see it aggregated for a specific controller, along with the metrics relevant to its health.

Running the OpenTelemetry Demo App on HashiCorp Nomad

A fun side project for one dev advocate turned into an OpenTelemetry tutorial with a collection of cloud-native tools. There’s a good chance I’m still working through this as you’re reading these words. 😆

Loop1

Do you need to monitor applications on-premises and in the cloud?

SolarWinds® Server & Application Monitor is designed to monitor your applications and their supporting infrastructure. Get continuous server monitoring, cross-stack correlation for your hybrid IT data, and the flexibility to monitor custom applications. Download a fully-functional 30-day free trial. (SPONSORED)



How to Use SkyWalking for Distributed Tracing in Istio?

A thorough guide for setting up your own distributed tracing infrastructure with Apache SkyWalking to capture observability data in an Istio service mesh. Honestly, it looks like a pretty painless way to get introduced to distributed tracing.

ļøUptime check of external sites & services

Uptime Kuma is one of those handy self-hosted services that nobody really talks about. We’ve covered it once before, but it bears repeating that this OSS project exists and remains a surprisingly competent alternative to paid health-check services.

How to create metrics that really matter?

A simple but relevant strategy for informing engineers’ choice of metrics instrumentation in their apps.

Grafana releases: New 2023 release schedule

Grafana Labs is adopting a monthly release cycle for the next year of Grafana releases. I don’t think this will impact users (in my experience, most folks update irregularly based on security or desired feature releases) as much as it will stabilize their internal processes, but it’s still good to see them set expectations within the community.

Tools

k8spacket/k8spacket

“packets traffic visualization for kubernetes”

Events

Monitorama PDX 2023 - June 26-28 (Portland, OR)

Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor