This week’s issue is heavy on AWS, serverless, and time series database topics. Looks like hiring for remote engineers is still going strong, with numerous job postings at the bottom. Enjoy! 😎🍻🚠
This issue is sponsored by:
Can you rely on your deployments?
In a recent Armory and Gartner report, 35% of respondents named reliability and consistency as their top pain point with app deployment. If you need help with consistent, reliable deployments, try Armory Continuous Deployment-as-a-Service. Check out more in the reports here.
Articles & News on monitoring.love
Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.
From The Community
An example for monitoring Selenium test results in Grafana using Prometheus and Pushgateway. If you’re already familiar with Prometheus you can skip the first half of the article.
Some tips and considerations for folks new to Lambda observability practices. Note that some of these still require practical hands-on experience with your respective service(s) in order to configure things properly.
This post is less about the evolution of serverless and more a collection of insights and concerns for anyone looking to adopt and maintain serverless infrastructure. But yes, it concludes with a very brief summary of some of the more popular commercial serverless monitoring services.
A wonderful article from Pinterest engineering discussing their use of time series data, and how their unique needs drove the design of the current implementation.
If you’re a time series database geek you might appreciate this post from Grafana engineers on how they reworked Mimir’s store-gateway to alleviate stalling issues with queries.
One of the more useful guides I’ve seen for crafting your own custom CloudWatch metrics and alarms.
Alertmanager is an excellent tool for routing alerts, but it can suffer from crashes like any other piece of software. This post explains how it aggregates alerts, what happens to them after a crash, and how to tune your setup to minimize unexpected behavior.
A nice writeup from engineers at QuintoAndar on how they leverage the Graphite exporter to collect metrics from Apache Spark into their existing Prometheus cluster.
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor