This week reminds me of what got me excited about monitoring in the first place: building reliable, scalable systems with the visibility and knowledge to maintain them. Some fun topics around time-series, alerting, and Kubernetes troubleshooting. Enjoy! 🎈🥂🌻
This issue is sponsored by:
Headed to Portland for Monitorama 2023 PDX? We sure are! Chronosphere’s Co-founder and CTO, Rob Skillington, will be speaking about cost-efficient metrics aggregation on Monday, June 26! Come check out his session, grab some swag, and enter for a chance to win a Mighty Bowser™ LEGO set! See what other activities we’re up to that week here.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
As a fan of push-based metrics collection, I’m not sure I buy into the rhetoric here, but this is a very good look at Prometheus’ strengths and how to use its multitude of features.
Anyone who’s tried to alert on complex time-series queries can empathize with this one. More proof that Kyle Kingsbury’s Riemann was ahead of its time.
Kubernetes does a good job of managing resources, but it’s naive to think we won’t need to troubleshoot it like any other system from time to time. And knowing how to debug something is the first step to monitoring it effectively.
Golang developers rejoice: the SkyWalking project has released a new auto-instrumenting agent specifically for Go applications. Looks like the older go2sky project will be deprecated in the not-too-distant future.
Some foundational tips and techniques for Kubernetes debugging that should inform your monitoring strategy.
Interesting discussion on the balance we strive for when building any reliable system. I’d posit that any team revisits this numerous times over a company’s growth.
People, Process, Technology - How has your business changed?
The 2nd Annual State of Availability survey is out and we want to hear from you. Tell us how your business has changed over the last year around ITOps, DevOps & AIOps. Survey respondents will be entered to win a $100 Amazon Gift Card. (SPONSORED)
If you’re looking to minimize your Prometheus retention but need to support longer ranges on Thanos queries, you might want to check out the Ruler component. This post is a quick anecdotal look at one company’s need for it in light of evolving retention demands.
We often focus on the processes and responsibilities during an incident response, but we tend to neglect the tremendous effect that how we communicate can have on our peers and customers.
Grafana has published security patch releases to address medium and high severity CVE advisories.
“The Golang auto-instrument Agent for Apache SkyWalking, which provides the native tracing abilities for Golang projects.”
Just two weeks left until everyone’s favorite monitoring conference of the year. I’m super excited to see everyone back in Portland for another awesome lineup of speakers and plenty of fun activities. Hope to see you there!
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor