A smorgasbord of technical articles this week, including an eBPF tracer tutorial, a look at metrics forecasting, and a deep-dive with Netflix engineers diagnosing some missing performance. Enjoy! 🔥📈🔔

This issue is sponsored by:

Are you looking to modernize Log Analytics while controlling the cost?

DataSet is the cloud-native event data platform that enables teams to achieve petabytes of effortless scalability and real-time performance at a fraction of the cost. Get complete visibility into your entire stack and experience the DataSet difference for free.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

It’s been amazing to see the community continue to grow. We’d love to have you join us and share what you’ve been working on.

From The Community

A Practical Guide to Capturing Production Traffic With eBPF

An excellent guide for creating eBPF-based protocol tracers to inspect your HTTP traffic. If you’re new to eBPF, this feels like a great hands-on lab for getting started.

Seeing through hardware counters: a journey to threefold performance increase

You have to tip your hat to Netflix engineers for this super detailed explanation of their search for missing performance after a hardware upgrade. I love this so much.

What if you never had to tweak your alerting thresholds for your metrics?

At some point I think forecasting got intertwined with data smoothing and became a “bad word” in our industry. Personally, I miss some of the unique statistical applications we used to read about and try in our own systems. Nice to see this topic resurface for teams interested in dynamic alerting applications.

Rebuilding Threat Detection and Incident Response at LinkedIn

As the worlds of Security and Observability continue to overlap, we’re seeing the continued integration of monitoring data and services within the security and incident response workflow. LinkedIn provides a detailed look at their own concerns and how these tools and technologies came together to form their software-defined Security Operations Center.

Using Prometheus to scrape temperature and humidity at your home

I always enjoy reading about the intersection of open source software and home automation. There’s something about using it in your home that makes it feel that much more meaningful to me.

Know the three-phase approach to Observability

A phased plan for introducing observability into your organization. Pretty high-level but still a reasonable model for anyone starting on the journey.

How to NOT miss CloudWatch Alarms

How an engineer tackled the shortcomings of repeated alarm notifications for AWS CloudWatch, developing their own pattern for anyone to use.

How to monitor etcd

If you run etcd, make sure you check out this thorough look at the service and the metrics you should be monitoring for it in production.

Security release: New versions of Grafana with critical and moderate CVE fixes

Security fixes for Grafana versions prior to 9.2.4 and 8.5.15. Start your upgrade engines…

Using PostgreSQL as a Scalable, Durable, and Reliable Storage for Jaeger Tracing

I’ve been a fan of Jaeger’s support for external storage (e.g. ClickHouse) using their gRPC plugin. The folks at Promscale worked with the Jaeger team to develop their own PostgreSQL-based version for Promscale.


Monitorama PDX 2023 - June 26-28 (Portland, OR)

Monitorama is returning to Portland, OR next summer. The 2022 conference was a fantastic event and I look forward to seeing you all again in 2023.

Job Opportunities

Infrastructure Ops Engineering at Fly (Remote)

Software Engineer, Resilience at Datadog (Remote)

Senior Site Reliability Engineer at Mozilla (NA Remote)

Sr. Software Engineer, Network Engineering at SeatGeek (US Remote)

See you next week!

– Jason (@obfuscurity), Monitoring Weekly Editor