Issue 118

Apologies for the late issue this week, but better late than never, amirite?!

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Designing a Metrics & Functions System for Monitoring, Machine Learning & AIOps

If you’re one of the many people on this list building monitoring SaaS apps, this will be an interesting read for you. The author talks about some of the challenges he’s facing in building a monitoring service at his own company, Siglos.

Container Monitoring and Observability: Challenges and Strategies

Container technologies are present in most every infrastructure stack today because they deliver on the promise of more nimble and resilient application development. Containers are helping to accelerate application delivery, but they have also increased operational complexity and risk. Read this Container Monitoring and Observability Guide to learn best practices for operating and monitoring containers at scale. (SPONSORED)

Prometheus + AlertManager

A how-to guide on setting up Prometheus and a metrics sidecar for use with Istio’s service mesh.

Beyond CSAT: Choosing (and using) the right metrics for your customer service team

Not quite systems/application monitoring, but you know I’m a big proponent of understanding how the entire business functions, so here you go. I particularly like the discussion around identification of metrics and mapping those to business impact metrics.

Streaming Log Analytics with Kafka

“Kresten Thorup discusses how and why they use Kafka internally and demos how they utilize it as a straightforward event-sourcing model for distributed deployments. He presents customer cases on utilizing Kafka to manage and buffer massive volumes of data ingest.”

**[The Cardinality Challenge in Monitoring | Logz.io](https://logz.io/blog/cardinality-challenge-in-monitoring/)**

There’s been a lot of talk about this “cardinality” stuff over the past couple of years. Still confused? Here’s a good article to help clear things up.

Targeted Diagnostic Logging in Production · Terse Systems

This is a super interesting article. “Diagnostic logging is typically not available in production, because of concerns that logging information at DEBUG level is indiscriminate. This blog post shows how to combine diagnostic logging with feature flag management to provide targeted debug information in production only for specific groups, users, or sessions.”
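If the idea is hard to picture, here’s a rough sketch of it in Python using only the standard `logging` module. To be clear, this is my own illustration and not the article’s implementation: the `DEBUG_ENABLED_USERS` set is a hypothetical stand-in for a real feature flag service, and the logger name, user IDs, and `handle_request` function are made up for the example.

```python
import io
import logging

# Hypothetical stand-in for a feature flag service: the set of user IDs
# that currently have debug logging switched on.
DEBUG_ENABLED_USERS = {"user-42"}


class TargetedDebugFilter(logging.Filter):
    """Pass INFO and above for everyone; pass DEBUG only for flagged users."""

    def filter(self, record):
        if record.levelno > logging.DEBUG:
            return True
        # user_id is attached to each record via the `extra` argument.
        return getattr(record, "user_id", None) in DEBUG_ENABLED_USERS


def build_logger(stream):
    logger = logging.getLogger("targeted-debug-demo")
    logger.setLevel(logging.DEBUG)  # let DEBUG through; the filter decides
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(user_id)s %(message)s")
    )
    handler.addFilter(TargetedDebugFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, user_id):
    ctx = {"user_id": user_id}
    logger.debug("cart contents loaded", extra=ctx)
    logger.info("checkout complete", extra=ctx)


out = io.StringIO()
log = build_logger(out)
handle_request(log, "user-42")  # flagged: DEBUG and INFO are emitted
handle_request(log, "user-7")   # not flagged: only INFO is emitted
print(out.getvalue(), end="")
```

The logger itself stays at DEBUG and the filter makes the per-user decision, so flipping the flag for a user changes what gets logged without a redeploy or a global log level change.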

piotrmurach/tty-logger: A readable, structured and beautiful logging for the terminal

Exactly what it says, and it looks neat. Ruby only, and built specifically for the TTY toolkit.

Monitoring at eBay with Druid

Somehow I missed this article when it came out back in May. The folks at eBay talk through their monitoring re-architecture from homegrown systems to an Apache Druid-based architecture.

Intro to Distributed Tracing – James Turnbull

I pretty much agree with James on this: “There’s still a strong sense that tracing is an enormous investment with potentially limited returns for many organizations. I tend to broadly agree but think we’re making some progress forward. I also think that, as a tool for engineers debugging, distributed tracing is becoming more an inevitability than an option.”

LightStep is one of the new breed of tools out there I’m excited about. Designed with modern, high-scale, high-traffic architectures in mind, LightStep makes it easy to spot, diagnose, and solve performance issues. Check it out here. (SPONSORED)

Loki’s Path to GA: Adding Structure to Unstructured Logs

“From the beginning, one of the first and probably most requested feature sets for Loki has been around a common thread: manipulating log lines. There are many use cases for this, including extracting labels, extracting metrics, setting a timestamp from the log content, or manipulating the log line before it is sent to Loki. In this post we will talk about our approach to solving this problem.”

Events

Sensu Summit 2019 - September 9-10, 2019 - Portland, OR USA

Sensu Summit is one of my favorite events of the year (second only to Monitorama, Jason! <3) and it’s coming up soon. Bonus: my business partner and friend, the (in)famous Corey Quinn, will be speaking. The folks at Sensu have always loved the Monitoring Weekly community, so they’re offering $100 off the ticket price. Just click this link to automatically have your discount applied.

Monitorama Baltimore 2019 - October 21-22, 2019 - Baltimore, MD USA

Sorry Sensu folk, but Monitorama will always be my first love. Turns out, my love can be bought too: there’s a discount code for Monitoring Weekly readers available for $50 off your ticket by clicking this link.

See you next week!

– Mike (@mike_julian) Monitoring Weekly Editor