Issue 140

I didn’t make it to KubeCon this week, but hopefully CNCF releases the videos soon. In the meantime, we’ve got a flurry of stories around Node.js observability, production readiness, and scaling challenges during the pandemic. Grab your favorite beverage and enjoy this week’s newsletter! ☕🌄

This issue is sponsored by:

Start incident response with context to all your alerts in one view

Moogsoft speeds up incident response with dynamic anomaly detection, suppressed alert noise, and correlated insights across all your telemetry data. Go from debugging across multiple tools, screens, and dashboards into a single incident view so you and your teams can take a more proactive approach to reduce MTTR. Sign up for the Moogsoft Free community plan today!

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

How we’re building a production readiness review process at Grafana Labs

Production Readiness Reviews feel like one of those things that every company should have, but so few actually do. In my own experience, the fallback is “institutional knowledge” maintained by a few key engineers. Great to see a company not only willing to talk about this publicly, but to actually share their process document.

Scaling in the age of COVID

What do you do when your already-busy internet service is forced to scale up rapidly due to a global pandemic? Do the best you can with what you have, and learn a lot about cascading failures. This one hits close to home.

Monitoring/logging your K8S NodeJS applications with Elasticsearch

This company moved their Node.js applications to Kubernetes, but then realized they were missing some of the observability niceties they’d gotten used to elsewhere. Adopting Elasticsearch, Elastic APM, and Kibana restored a lot of their missing monitoring and logging functionality.

How to debug Node.js applications in Kubernetes?

Conversely, if you’re trying to debug your K8s Node.js app in real time, you might need to turn to some more unorthodox approaches, including breakpoints in VS Code or even the dev console. Some fun examples here.

How Do We Monitor the Picus Infrastructure?

Picus Security talks about the tools and processes (and integrations) they employ to monitor and alert on their production infrastructure.

Periscope — The Kubernetes Monitoring and Tracking Dashboard

OSLabs has released a new dashboard project for monitoring your Kubernetes clusters.

10 Trends in Real-World Container Use

Although this isn’t a monitoring article in the strictest sense, I recognize that trends are a big driver in how we design and operate our systems. Hence, it’s important to at least keep an eye on where the industry is moving so we can start planning for shifts in the observability landscape.

Chronosphere is the only observability platform that puts you back in control by taming rampant data growth and cloud-native complexity, delivering increased business confidence. Teams at enterprises, large cloud-native, and mid-market companies around the world trust Chronosphere to help them operate scalable, highly available, and resilient applications. Learn more here. (SPONSORED)

New in Grafana 8.2: Test contact points for alerts before they fire

A welcome change in Grafana 8.2: you can now send a test alert to a new contact point before actually saving it. This works through both the UI and the API.

Kafka Metrics Monitoring with Prometheus

A no-nonsense look at collecting Kafka metrics using the Prometheus JMX exporter.

Getting the most out of Open Telemetry with manual instrumentation

Setting up your own manual traces for Node.js with OpenTelemetry and Jaeger.

Tracing AWS Lambdas with OpenTelemetry and Elastic Observability

A very thorough walkthrough (with a sample serverless project) for tracing your AWS Lambda functions with the Elastic observability stack.

Tools

oslabs-beta/Periscope

“Periscope integrates with a Prometheus server and then displays the core metrics that any engineer needs to understand the state and health of their [Kubernetes] cluster. Engineers can see CPU, disk usage and memory usage across their cluster.”

prometheus/jmx_exporter

“JMX to Prometheus exporter: a collector that can configurably scrape and expose mBeans of a JMX target.”

Events

InfluxDays North America 2021 Virtual Experience

InfluxData is hosting their annual user conference as a free virtual event, taking place October 26-27, 2021.

Job Opportunities

Senior Software Engineer - SRE at Iterable (Remote)

Staff Software Engineer - SRE at Iterable (Remote)

Senior Cloud Engineer at Graylog (Remote)

Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture, or your money back. (SPONSORED)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor