Stitcher for Podcasts

Episode Info

Uber manages the car rides for millions of people. The Uber system must remain operational 24/7, and the app involves financial transactions and the safety of passengers.

Uber infrastructure runs across thousands of server instances and produces terabytes of monitoring data. The monitoring data is used to understand the health of the software systems as well as relevant business metrics, such as driver efficiency, daily revenues, and user satisfaction.

Uber adopted the Prometheus monitoring system to manage their monitoring data. Prometheus regularly scrapes metrics across infrastructure to gather time series data about the state of everything across Uber. As the usage of Prometheus has grown within the company, Uber has had to figure out how to scale their monitoring platform.

M3 is a monitoring system built at Uber to scale Prometheus, providing a platform that scales both data storage and query serving. Rob Skillington is a staff software engineer at Uber, and he joins the show to talk about monitoring at Uber, from the requirements of the system to the implementation of M3.

At Uber, M3 powers dashboards, ad-hoc queries, and alerting. M3 was open sourced to give other users access to a scalable Prometheus solution. In a previous episode with Bryan Boreham, we discussed one strategy for scaling Prometheus. Today’s episode covers another scalability solution, with M3.


The post Uber’s Monitoring Platform with Rob Skillington appeared first on Software Engineering Daily.
