Monitoring Puma web server with Prometheus and Grafana



Localizing web application performance problems and response latency can be tricky in projects with complex infrastructure, so having monitoring for all the services is crucial.

Sometimes performance degradation originates one step ahead of the main application, in the web server itself, when it runs out of capacity. Since the most popular web server for running Ruby web applications is Puma, let me explain how to implement and tune simple monitoring for it.

Puma control application

Puma has a built-in control application for managing the web server and querying its internal statistics. The control application is activated by the following code in the Puma configuration file config/puma.rb:

activate_control_app('tcp://127.0.0.1:9000', auth_token: 'top_secret')
# or without any params, just activate_control_app

To serve it, Puma runs a separate web server instance with a specific Rack backend application.

This backend returns the worker statistics in JSON format:

# GET /stats?secret=top_secret
{
  "worker_status" => [
    { "backlog" => 0, "running" => 5, "pool_capacity" => 5, "max_threads" => 5 },
    { "backlog" => 0, "running" => 5, "pool_capacity" => 5, "max_threads" => 5 }
  ]
}

The exact response schema depends on the Puma configuration: in clustered mode (more than one worker), the output describes each worker.

If Puma runs in single (non-clustered) mode, the result describes the single worker directly, with a flat output:
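A sketch of what that flat response looks like (the values are illustrative):

```
# GET /stats?secret=top_secret
{ "backlog" => 0, "running" => 5, "pool_capacity" => 5, "max_threads" => 5 }
```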


Anyway, there are meaningful metrics for monitoring purposes, such as:

- backlog — the number of requests waiting in the queue;
- running — the number of spawned worker threads;
- pool_capacity — the number of requests the worker can still accept before it is saturated;
- max_threads — the configured maximum number of threads.

Using them, we can automate the monitoring system, which checks the values periodically for any Puma cluster and shows the metrics over time, as on the chart below:


Decreasing pool_capacity means a rising load on the server. It is the starting point of growing request-processing latency caused by capacity issues.

Yabeda framework

For the Ruby world, we have an extendable framework for collecting and exporting metrics, called Yabeda.

It provides a simple DSL for describing metrics and fetching their values with plain lambda functions.
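As a rough sketch of that DSL (the metric name, its comment, and the fetch_puma_stats helper are illustrative, not taken from a real plugin):

```
Yabeda.configure do
  group :puma do
    gauge :backlog, comment: "Number of requests waiting in the queue"
  end

  # The collect block is invoked before each metrics export.
  collect do
    puma.backlog.set({}, fetch_puma_stats["backlog"])
  end
end
```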

For now, the Yabeda framework provides out-of-the-box solutions for monitoring Rails and Sidekiq.


Monitoring implies periodically storing metric values for future analysis. One of the most popular and suitable solutions for that is Prometheus. As Prometheus implements an "HTTP pull model," it expects the monitored subject to expose an endpoint with the metric values in a specific format.

Yabeda framework allows exporting metrics with the help of Prometheus Exporter .

Yabeda for Puma

Now I am going to introduce one more monitoring solution of the Yabeda family: the Puma monitoring plugin.

You just need to load the yabeda-puma-plugin gem and configure the Puma web server with the following line in the config/puma.rb file:

plugin :yabeda

That's it. After the Puma web server starts, the plugin does all the job of collecting the metrics.
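Putting both pieces into the Puma configuration might look like this (the worker and thread counts, the socket address, and the token are illustrative):

```
# config/puma.rb
workers 2
threads 5, 5

# The plugin reads the stats through the control application,
# so it has to be activated as well.
activate_control_app('tcp://127.0.0.1:9000', auth_token: 'top_secret')

plugin :yabeda
```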

Get things together

Here is the overall architecture of the Puma monitoring solution:


It gets all the metrics from the Puma control application statistics and feeds them into the Yabeda framework. The values can then be exported by the Prometheus rack middleware, which serves the /metrics path of the web application and provides the metric values in a Prometheus-friendly format. Here is a sample response of the metrics endpoint for Puma configured with two workers:

GET /metrics

puma_backlog{index="0"} 0
puma_backlog{index="1"} 0
puma_running{index="0"} 5
puma_running{index="1"} 5
puma_pool_capacity{index="0"} 1
puma_pool_capacity{index="1"} 5
puma_max_threads{index="0"} 5
puma_max_threads{index="1"} 5
puma_workers 2
puma_booted_workers 2
puma_old_workers 0
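The middleware serving this endpoint can be mounted in the Rack stack; a minimal sketch with the yabeda-prometheus gem (assuming a plain config.ru; YourApp stands in for the actual application constant):

```
# config.ru
require 'yabeda/prometheus'

# Serve metrics at /metrics in the Prometheus text format.
use Yabeda::Prometheus::Exporter

run YourApp
```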


Depending on your needs, the data can be visualized in many ways; here is an example of basic summarized metric values:


This diagram shows the overall metric values for all the Puma workers. The indicators could also be displayed separately per worker, or per Puma cluster instance.

"Application busy" metric

Looking at all the raw Puma metrics might not be visually comfortable for a quick overview of the system in general. A more suitable way is to calculate a composite metric describing the overall workload of the web server as a percentage. Let's call it "Application busy", or just busy-metric. The formula evaluates the percentage of the overall workload:

(1 - pool_capacity / max_threads) * 100
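Applied to the sample /metrics output above, with the per-worker values summed first (busy_percentage is a hypothetical helper, not part of any gem):

```ruby
# Compute the busy-metric from per-worker Puma stats.
def busy_percentage(workers)
  pool_capacity = workers.sum { |w| w["pool_capacity"] }
  max_threads   = workers.sum { |w| w["max_threads"] }
  (1 - pool_capacity.to_f / max_threads) * 100
end

# Values from the two-worker sample: pool_capacity 1 and 5, max_threads 5 and 5.
workers = [
  { "pool_capacity" => 1, "max_threads" => 5 },
  { "pool_capacity" => 5, "max_threads" => 5 }
]
puts busy_percentage(workers) # => 40.0
```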

This gives a single chart instead of several:


The busy-metric is more informative for overviewing the health of the system. It shows the actual workload of the whole Puma cluster in a friendlier way. When the busy-metric spikes, the application is under high load, and the Puma web server probably needs tuning.

The busy-metric allows determining the problem state easily, but for investigating a specific incident, the raw metrics might be more helpful.

Metrics playground

The Yabeda framework supplies an example project with all the monitoring infrastructure set up for monitoring Sidekiq, Rails, and Puma. It is easy to run with docker-compose.


Setting up the monitoring infrastructure helps to build more stable and maintainable software, and to sleep calmly at night.

Monitoring is made easy with the Yabeda framework.

Check out the yabeda-puma-plugin to get ready to monitor Puma!
