Issue 109

This issue is sponsored by:

Serverless has grown in popularity with application developers because it abstracts away the pesky operations pieces of development. Download SignalFx’s Definitive Guide to Serverless Monitoring and Observability and learn from dozens of SignalFx customers using serverless in production.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

We had some fantastic discussion about what “legacy” really means in infrastructure this week. You should come hang out and contribute to the fun conversations!

Understanding Observability (and Monitoring) with Christine Yen

Monitoring and observability are near and dear to my heart, so this week’s episode is exciting: Christine Yen, Cofounder & CEO of Honeycomb, joins me to talk about observability, why dashboards aren’t as helpful as you think, and the value of being able to ask questions of your own application and infrastructure when you’re troubleshooting.

From The Community

Exploring Istio telemetry and observability

Service meshes are the new hotness, right? I’m not even gonna pretend to understand what’s going on here, but I know at least a few of you will enjoy this article.

Log to Elasticsearch using curl

This is more involved than I expected, but certainly useful.

Observability should not slow you down

I love reading stories of how teams decide on their new monitoring tool(s). While the outcome isn’t surprising with this one, the reasoning they used to get there might be useful to many other teams.

Summary: The Tyranny of Metrics

The (article) author’s summary is great and this sounds like an interesting book: “While metrics can and do offer extraordinary benefits when they’re used carefully and properly, there is significant chance of them being chosen incorrectly, being gamed and corrupted by various parties, and ultimately becoming toxic to the very cause they were created to help.” They also wrote a useful piece on Examples of Bad Metrics. (Note that these are not system/app metrics; the topic is much broader and is useful for KPI discussions.)

The state of system observability with BPF

This, coming from the man once known for screaming at disk arrays: “Gregg started with a demonstration tool that he had just written: its immediate manifestation was in the creation of a high-pitched tone that varied in frequency as he walked around the lectern. It was, it turns out, a BPF-based tool that extracts the signal strength of the laptop’s WiFi connection from the kernel and creates a noise in response. As he interfered with that signal with his body, the strength (and thus the pitch of the tone) varied. By tethering the laptop to his phone, he used the tool to measure how close he was to the laptop.” So cool <3

Practical Metrics with Graphite and Terraform (Part 2)

Part 2 continues with some of their thoughts on Graphite cluster design using go-graphite, statsd, Grafana, and DNS round-robin. Catch up on Part 1 here.

healthchecks/healthchecks: A Cron Monitoring Tool written in Python & Django

Oddly, monitoring for the absence of information, such as a recurring job or cron that never ran, isn’t commonly found in monitoring tools. For those of you doing your monitoring with self-hosted tools, this looks like it’ll be handy.
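
For anyone new to the idea, absence-of-information monitoring boils down to a dead man's switch: each job checks in when it finishes, and the monitor alerts when the check-ins stop. Here is a minimal Python sketch of that decision. The function name, status labels, and parameters are hypothetical and chosen for illustration; this is not code from the healthchecks project.

```python
from datetime import datetime, timedelta, timezone


def job_status(last_ping, period, grace, now):
    """Classify a recurring job by how long ago it last checked in.

    last_ping: datetime of the most recent check-in, or None if there is none
    period:    expected interval between runs (timedelta)
    grace:     extra slack allowed before alerting (timedelta)
    now:       current time, passed in so the logic is easy to test
    """
    if last_ping is None:
        return "new"  # never checked in, so there is nothing to compare yet
    elapsed = now - last_ping
    if elapsed <= period:
        return "up"
    if elapsed <= period + grace:
        return "late"  # overdue, but still inside the grace window
    return "down"  # the silence has lasted too long: time to alert


if __name__ == "__main__":
    now = datetime(2019, 5, 17, 12, 0, tzinfo=timezone.utc)
    nightly = timedelta(days=1)
    grace = timedelta(hours=1)
    for hours_ago in (3, 24.5, 26):
        last = now - timedelta(hours=hours_ago)
        print(hours_ago, job_status(last, nightly, grace, now))
```

The point of the grace window is that a job which merely ran long does not page anyone; only sustained silence does.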

cloudmarker/cloudmarker: Cloud monitoring tool and framework

It’s sort of like Cloud Custodian, except built for Azure (and GCP was added later). Interestingly, no AWS support.

Centralised Logging for Lambda@Edge

This just sounds like an awful problem: “This is due to the fact that Lambda@Edge utilises CloudFront edge locations to distribute your Lambda functions across the whole AWS world, and when a user accesses your CloudFront content, the logs will appear in the region that’s closest to that edge location, and not in us-east-1 where Lambda@Edge is deployed! The logs are sometimes difficult to find, particularly when you have no idea where the user having issues made the request from, or when you’re using all edge location provided by CloudFront service.”
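
To make the problem concrete: a centralised setup has to look in every region for the same log group. The rough Python sketch below builds that list of places to check. It assumes the log group naming scheme `/aws/lambda/us-east-1.<function-name>` and a region list supplied by the caller; the function name and the example values are hypothetical, and it makes no AWS calls.

```python
def edge_log_targets(function_name, regions):
    """Return the (region, log group) pairs to search for one Lambda@Edge function.

    Assumption: the logs land in the region nearest the edge location, in a
    log group whose name keeps the us-east-1 prefix wherever it is created.
    """
    log_group = f"/aws/lambda/us-east-1.{function_name}"
    # sorted(set(...)) de-duplicates regions and gives a stable order
    return [(region, log_group) for region in sorted(set(regions))]


if __name__ == "__main__":
    # Hypothetical function name and a deliberately short region list.
    for region, group in edge_log_targets("viewer-request", ["eu-west-2", "us-east-1"]):
        print(region, group)
```

Each pair can then be handed to whatever CloudWatch Logs client or subscription mechanism you use to ship the logs to one place.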

Be discerning in what dashboards you share with users

This is a more common problem than you would think, and I’m sure everyone reading this has a similar story to tell.

Worth a Look: Public Grafana Dashboards

I’ve linked to a few of these before, but there are some new ones I didn’t know about. They might give you some ideas for dashboard organization.

Solr Key Metrics to Monitor and Solr Open Source Monitoring Tools

The folks at Sematext are back again and giving Datadog’s content team a run for their money. Got some Solr lying around? Maybe this will help.

This issue is sponsored by:

Raygun’s Continuous Delivery Process

The folks at Raygun have some thoughts on CI/CD, but rather than bore you with product news, they wrote an article about how they do CI/CD for Raygun itself. Check it out.

Events

Monitorama Baltimore 2019 - October 21-22, 2019 - Baltimore, MD USA

Yes, you read that right: Monitorama is doing a new event on the American east coast! I’m super excited.

Monitoring & AIOps Meetup - May 22nd, 2019 - Mountain View, CA USA

I’m emceeing this meetup in Mountain View next week, with two awesome speakers: Stefan Apitz and J Paul Reed. You should come out for it!

Icinga community meetup - June 12, 2019 - The Hague, Netherlands

It’s the very first meetup for this new group, so if you’re in the area and like Icinga, you should be sure to go.

See you next week!

– Mike (@mike_julian) Monitoring Weekly Editor