Issue 197
An eccentric mashup of monitoring posts this week, with an emphasis on metrics design and collection. And an old man yells at the cloud.
This issue is sponsored by:
Your on-call holiday survival kit is here.
In the spirit of the holidays, Chronosphere has packaged 4 presents to help Engineering teams march towards reducing stress and avoiding burnout. Put your best foot forward (in style) while moving towards on-call experiences that suck less. Get your kit!
Articles & News on monitoring.love
Observability & Monitoring Community Slack
Come hang out with all your fellow Monitoring Weekly readers. I mean, I'm also there, but I'm sure everyone else is way cooler.
From The Community
Making a technical platform…
A unique look at the evolution (pun intended) of one company's platform infrastructure, including Observability and related concerns.
Understanding Duplicate Samples and Out-of-order Timestamp Errors in Prometheus
This is a fascinating read on Prometheus out-of-order metrics, particularly if you're a crufty old TSDB admin and former Graphite maintainer who argues this should have been supported(*) years ago. All teasing aside, it really is a very interesting post with plenty of relevant technical details and helpful bits for Prometheus admins.
* I acknowledge that all TSDB authors make compromises relevant to their respective requirements, but after having seen countless "new hot metrics engines" come and go, it feels inevitable to me that all competing TSDBs eventually settle on roughly the same feature set with the primary differences boiling down to implementation details and a select collection of bugs deemed too difficult to fix. Don't @ me.
Phantom Metrics: Why Your Monitoring Dashboard May Be Lying to You
I've been guilty of "monitoring all the things" in the past, but we still hear the same question repeated year after year… "what should I be monitoring?" This post revisits numerous important considerations for metrics design and collection.
k8spacket - are your TLS connections inside the cluster still secure?
Monitoring for TLS versions and ciphers feels like a bit of an edge case, but I have no doubt there are security and compliance engineers in your org right now that would swoon over this.
How to monitor kube-controller-manager
I've genuinely enjoyed these monitoring deep-dives on Kubernetes components from Sysdig. Although much of this information is available in the official docs, it's nice to see it aggregated for a specific controller, along with the metrics relevant to its health.
Running the OpenTelemetry Demo App on HashiCorp Nomad
A fun side project for one dev advocate turned into an OpenTelemetry tutorial with a collection of cloud-native tools. There's a good chance I'm still working through this as you're reading these words.
Do you need to monitor applications on-premises and in the cloud?
SolarWinds® Server & Application Monitor is designed to monitor your applications and their supporting infrastructure. Get continuous server monitoring, cross-stack correlation for your hybrid IT data, and the flexibility to monitor custom applications. Download a fully-functional 30-day free trial. (SPONSORED)
How to Use SkyWalking for Distributed Tracing in Istio?
A thorough guide for setting up your own distributed tracing infrastructure with Apache SkyWalking to capture observability in an Istio service mesh. Honestly looks like a pretty painless way to get introduced to distributed tracing.
Uptime check of external sites & services
Uptime Kuma is one of those handy self-hosted services that nobody really talks about. We've covered it once before but it bears a reminder that this OSS project exists and remains a surprisingly competent alternative to paid health-check services.
How to create metrics that really matter?
A simple but relevant strategy for informing engineers' choice of metrics instrumentation in their apps.
Grafana releases: New 2023 release schedule
Grafana Labs is adopting a monthly release cycle for the next year of Grafana releases. I don't think this will necessarily impact users (from my experience, most folks update irregularly based on security or desired feature releases) as much as it will stabilize Grafana Labs' internal processes, but it's still good to see them set expectations within the community.
Tools
"packets traffic visualization for kubernetes"
Events
Monitorama PDX 2023 - June 26-28 (Portland, OR)
Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.
See you next week!
- Jason (@obfuscurity) Monitoring Weekly Editor