Some great articles this week, with an emphasis on design patterns, reliability, and some new-to-me tools that I’m hoping to try out soon. Enjoy! ☕📈💖
Elastic Stack 8.0 and 10 years of Elastic are coming to you
The next major release of the Elastic Stack is coming soon, and it's also time to look back at 10 years of Elastic with the founders. Learn more about both topics, plus deep dives into observability, this Friday, February 11, at ElasticCC, the free technical community conference from Elastic. Sign up today!
Articles & News on monitoring.love
It’s been amazing to see the community grow throughout 2021 and into 2022. We’d love to have you join us and share what you’ve been working on.
From The Community
Reliability means something different to every company, but it’s critical to have a shared understanding of what it means at yours. This manifesto from Delivery Hero is a fantastic example of how to drive consensus and set expectations among your engineering teams.
An insightful look at the bare minimum of metrics that service owners at Salesforce are expected to collect and monitor.
Scaling systems is the kind of challenge that most of us live for, but it takes experience to learn the pitfalls and patterns that save us time and money the next time around. It should be no surprise that so many of these considerations overlap with the observability domain.
Back in 2020, SoundCloud announced the release of Periskop, an exception handling service modeled after Prometheus’ pull model. They’ve posted an update detailing their progress with Periskop along with a list of planned features.
I’ve never heard of this before, but I kind of love it now. Feels like a logarithmic scale for your X axis. ⌚🧙
How to set up the Kiali console for managing and gaining observability over your Istio service mesh.
How Uber leverages their observability data as part of a larger system to help identify fraudulent activity.
The SQL-powered observability backend
Analyze Prometheus metrics and OpenTelemetry traces together using Promscale + the power of SQL. Promscale is open source and built on top of PostgreSQL/TimescaleDB. Get the system insights you need with the technology you’re familiar with. Learn more. (SPONSORED)
A collection of best practices and principles for managing communications during and after an incident.
If you’re a Postgres administrator (or work with one), you probably know how important it is to keep an eye on your WAL activities. Here are some really handy queries for trending the health of your database.
A quick fix for minimizing cAdvisor’s CPU impact on your clusters.
“Kiali provides answers to the questions: What microservices are part of my Istio service mesh and how are they connected?”
“This is the router (or relay, or reverse-proxy) for Graphite. It routes incoming records according to the specified rules. The Nanotube is designed for high-load systems. It is used at Booking.com to route up to a million incoming records/sec on a single box with a typical production config.”
“Pull based, language agnostic exception aggregator for microservice environments.”
Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture–or your money back. (SPONSORED)
See you next week!
– Jason (@obfuscurity), Monitoring Weekly Editor