Background job queues and priorities may be the wrong path

source link: https://alexis.bernard.io/blog/2023-10-15-background-job-queues-and-priorities-may-be-the-wrong-path.html

October 15, 2023

I got this idea after reading Reflections on GoodJob for Solid Queue by Ben Sheldon, the author of the excellent GoodJob.

I already agreed that purpose queues (mail, API, stats, …) were bad. I was convinced that the best practice was to have job queues that define a priority, such as low, medium, and high. But after reading the following passage, I completely changed my mind:

Queue design and multi-queue execution pools. I do think queue design is a place where lots of people do it wrong. I believe queues should be organized by maximum total latency SLO (latency_15s, latency_15m , latency_8h) and not by their purpose or dependencies (mailers, billing, api). Nate Berkopec believes similarly. And I think that informs that execution pools (e.g. thread pools) should be able to work from multiple queues and have independent concurrency configuration (e.g. number of threads), both to ease transition from the latter to the former, but also because it allows sharing resources as optimally as possible (having 3 separate pools that pull from "latency_15s", "latency_15m, latency_15s", and "latency_8h,*" in GoodJob’s syntax). I personally think concepts like priority or ordered-queues lead to bad queue design, so I wouldn’t sweat that. Any ordering regime more complex than first-in-first-out (FIFO) prioritizes capacity (or lack thereof) over latency. This might sound strange coming from me who champions running workloads in the webbrowser on tiny dynos, but it’s different in my mind: I don’t think it’s possible to meet a latency target through prioritization when there is a fundamental lack of capacity.

Indeed, defining latency queues (latency_15s, latency_15m, latency_8h) is much better than defining priority queues, because if the high-priority queue is always busy, the others will never start. However, with the idea of a latency SLO, we could get rid of queues entirely.
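
To make the starvation problem concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: a single worker that starts one job per tick, a new high-priority job arriving at every tick, and the names SimJob and simulate, which do not come from any library.

```ruby
# Hypothetical model: not an ActiveJob, GoodJob, or Solid Queue API.
SimJob = Struct.new(:name, :priority, :enqueued_at, :max_latency)

# Runs a single worker for `ticks` ticks. At every tick a new high-priority
# job arrives, then the worker starts the job chosen by the given block.
# Returns a list of [job name, start tick] pairs.
def simulate(ticks:, &pick)
  pending = [SimJob.new("low", 1, 0, 5)] # one low-priority job, enqueued at tick 0
  started = []
  ticks.times do |now|
    pending << SimJob.new("high-#{now}", 0, now, 3)
    job = pick.call(pending)
    pending.delete(job)
    started << [job.name, now]
  end
  started
end

# Strict priority: lowest priority number first, then oldest first.
by_priority = simulate(ticks: 10) do |jobs|
  jobs.min_by { |job| [job.priority, job.enqueued_at] }
end

# Latency based: earliest deadline first, oldest first on a tie.
by_deadline = simulate(ticks: 10) do |jobs|
  jobs.min_by { |job| [job.enqueued_at + job.max_latency, job.enqueued_at] }
end

by_priority.any? { |name, _| name == "low" } # => false, the low job never starts
by_deadline.assoc("low")                     # => ["low", 2], started before its deadline
```

With strict priorities the low job waits forever as long as high-priority jobs keep coming. With deadlines it starts as soon as its deadline becomes the earliest one.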

A job needs only two attributes to define when it should start: run_at and max_latency. The job worker then only needs to order jobs by run_at + max_latency and take the first one. It seems both flexible and simple.

Here is an example to illustrate:


        class ApplicationJob < ActiveJob::Base
          max_latency 10.minutes # Arbitrary default value
        end

        class SendBoringEmailJob < ApplicationJob
          max_latency 30.minutes # These jobs will be started in less than 30 minutes by default
        end

        class SendImportantEmailJob < ApplicationJob
          max_latency 1.minute # These jobs will be started in less than a minute by default
        end

        # Of course, it should be possible to override default values
        # Job 1 will start between 10 and 15 minutes because it tolerates 5 minutes of latency
        SendBoringEmailJob.set(run_at: 10.minutes.from_now, max_latency: 5.minutes).perform_later

        # Job 2 will start in 11 minutes and does not tolerate latency
        SendBoringEmailJob.set(run_at: 11.minutes.from_now, max_latency: 0).perform_later

        # If workers are quiet, job1 will be run first in 10 minutes
        # If workers are busy, job2 will be run first in 11 minutes
        # If workers are too busy, both jobs will exceed their max latency.
        # The last case is a great indicator for either optimizing jobs or adding workers.
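
The worker side of this idea can be sketched in plain Ruby. This is only an illustration under stated assumptions, not an existing ActiveJob or GoodJob feature: ScheduledJob, next_job, and late? are hypothetical names, and times are plain integers counted in minutes.

```ruby
# Hypothetical model of a job with the two proposed attributes.
ScheduledJob = Struct.new(:name, :run_at, :max_latency) do
  # Latest acceptable start time.
  def deadline
    run_at + max_latency
  end
end

# Picks the job to start at time `now`: among the jobs whose run_at has
# been reached, the one with the earliest deadline. Returns nil if none.
def next_job(jobs, now)
  jobs.select { |job| job.run_at <= now }.min_by(&:deadline)
end

# True when a job started at `started_at` has exceeded its max latency.
def late?(job, started_at)
  started_at > job.deadline
end

job1 = ScheduledJob.new("job1", 10, 5) # deadline at minute 15
job2 = ScheduledJob.new("job2", 11, 0) # deadline at minute 11
jobs = [job1, job2]

next_job(jobs, 9)       # => nil, nothing is due yet
next_job(jobs, 10).name # => "job1", quiet workers: only job1 is due
next_job(jobs, 12).name # => "job2", busy workers: both are due, job2 has the earliest deadline
late?(job2, 12)         # => true, a signal to optimize jobs or add workers
late?(job1, 13)         # => false
```

Counting how often late? returns true gives a direct measure of whether the latency objective is being met.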
      

After this revelation, I strongly believe that background jobs should have neither queues nor priorities. A maximum latency is enough to meet a Service Level Objective, which is much better. Job queues are from the past. Viva latency SLO!

