Issue 041
Hey folks, welcome to another installment of Monitoring Weekly! Did you write something about monitoring recently? Maybe got an idea rolling around in your head? Send it on over and let the community learn from you. :D
Monitoring News, Articles, and Blog posts
Building a Distributed Log from Scratch, Part 3: Scaling Message Delivery
Part 3 of the series I linked to last week is up. Parts 1 and 2 covered theory and basic implementation, while Part 3 takes us into the realm of distributed logging for non-trivial workloads (aka most of you running production sites).
The trend of market consolidation continues, this time with SolarWinds acquiring Loggly. You may know SolarWinds from their enterprise software (SolarWinds NPM, among many, many others) and extraordinarily annoying enterprise salespeople. However, for the last few years they’ve been buying up the likes of Pingdom, Papertrail, Librato, Scout, and now Loggly, all in pursuit of establishing SolarWinds Cloud as a major player in cloud monitoring. Buying Loggly is interesting, given that they already have Papertrail. While Papertrail covers the downmarket segment, this acquisition of Loggly allows them to capture the midmarket and upmarket segments as well.
Google Cloud Platform Blog: Consequences of SLO violations
What happens when an SLO is violated at Google? There are several things, including the last resort many of us have heard about (revoking SRE support), but there are plenty of more useful actions to take before that. This is especially important, since most of us in Ops can’t just tell our dev teams we’re not helping them anymore.
Puppet Server Monitoring with the ELK Stack - Part 1
How to get your Puppet converge logs into ELK. Not really much more to say than that. :)
Monitoring Portal Discussion Forums
Some folks from the Icinga community have started a monitoring discussion forum! It’s pretty heavily geared toward Icinga/OMD right now, but maybe we can show them some love and expand it.
This Is What it Takes to Measure the Internet
Someone has been pinging every internet-reachable device on earth since 2006. The visualization is pretty neat, and especially notable is the continued outage in San Juan, Puerto Rico. There’s a whole bunch of information, including raw datasets, at the ANT homepage.
Influx/Days 2017 San Francisco (videos)
The recordings from InfluxDays 2017 SF are up now. If you missed the event, now’s your chance to hear all the talks.
See you next week!
– Mike (@mike_julian) Monitoring Weekly Editor