I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
In a previous post I wrote about pre-generating cache via background jobs. I described an example of an online banking app where we pre-generate a cache of
recent_transactions. This helps even out load on the system by pushing some of the data into the cache before visitors come to the site. In this post we will compare several ways to structure those jobs:
- One job for all records
- One job for each record
- Loop through records in slices
- Different queues and workers
One job for all records
The simplest design is to loop through all the records in one job.
The downside of this approach is that if we have millions of
MyModel records, the job can take a very long time to complete. And what if we need to deploy code that restarts the background job workers? We won't know which records have been processed and which have not. Best practices for background jobs recommend keeping them small and idempotent.
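A minimal sketch of this shape in plain Ruby (no Rails required). Here CACHE, MyModel, and WarmAllCachesJob are illustrative stand-ins for Rails.cache, an ActiveRecord model, and an ActiveJob class:

```ruby
# Stand-ins for the real app's cache and model (names are illustrative).
CACHE = {}

MyModel = Struct.new(:id) do
  # Placeholder for the expensive query we want to pre-generate.
  def recent_transactions
    "transactions for record #{id}"
  end

  def self.all
    @all ||= (1..4).map { |i| new(i) }
  end
end

# One job loops through ALL records. Simple, but it is a single
# long-running unit of work: restarting workers mid-run loses track
# of which records have been processed.
class WarmAllCachesJob
  def perform
    MyModel.all.each do |record|
      CACHE["recent_transactions/#{record.id}"] = record.recent_transactions
    end
  end
end

WarmAllCachesJob.new.perform
```

With millions of records instead of four, that single perform call is exactly the long-running, non-resumable job described above.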
One job for each record
We can queue one job per record by separating our code into two jobs: a parent job that loops through the records and a child job that processes a single record.
Each child job completes very quickly, and they run in parallel. Since it is not recommended to serialize complete objects into the queue, we will pass some kind of record identifier (such as a GlobalID). But this causes a lot of queries against the primary DB, looking up records one at a time.
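The two-job split can be sketched like this (plain Ruby, no Rails). QUEUE stands in for the job backend; in a real app the parent would call something like perform_later(record.id) instead of pushing onto an array:

```ruby
CACHE = {}
QUEUE = []  # stand-in for the job backend's queue

MyModel = Struct.new(:id) do
  def recent_transactions
    "transactions for record #{id}"
  end

  def self.all
    @all ||= (1..4).map { |i| new(i) }
  end

  def self.find(id)
    all.detect { |r| r.id == id }  # one primary-DB lookup per job
  end
end

# Parent job only enqueues child jobs; it finishes almost instantly.
class WarmAllCachesJob
  def perform
    # Serialize an identifier (like a GlobalID), never the whole object.
    MyModel.all.each { |record| QUEUE << [WarmOneCacheJob, record.id] }
  end
end

# Child job is small and idempotent, but each one queries the primary DB.
class WarmOneCacheJob
  def perform(id)
    record = MyModel.find(id)
    CACHE["recent_transactions/#{record.id}"] = record.recent_transactions
  end
end

WarmAllCachesJob.new.perform
QUEUE.each { |job_class, id| job_class.new.perform(id) }  # workers drain the queue
```

Each child job is now tiny and safely retryable, at the cost of one DB lookup per record.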
Loop through records in slices
And now we come to the Goldilocks solution - not too big and not too small. We want to break up the process into smaller chunks but instead of processing one record at a time we will process several (let’s say 10).
One downside of this approach is that
pluck will request the IDs of ALL records from the primary DB, store them in an array, and loop through them. Different ORMs support a
batch_size option for querying records, so we can do the equivalent of
select id from TableName limit 10 offset ....
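The slice approach can be sketched as follows (plain Ruby, no Rails). pluck_ids and where_id are illustrative stand-ins for the ORM's pluck and a batched id query; in ActiveRecord, in_batches or find_in_batches would avoid loading every id up front:

```ruby
CACHE = {}
QUEUE = []  # stand-in for the job backend's queue

MyModel = Struct.new(:id) do
  def recent_transactions
    "transactions for record #{id}"
  end

  def self.all
    @all ||= (1..25).map { |i| new(i) }
  end

  def self.pluck_ids
    all.map(&:id)  # note: loads EVERY id into one array, like pluck(:id)
  end

  def self.where_id(ids)
    all.select { |r| ids.include?(r.id) }  # one query per slice
  end
end

# Parent job enqueues one child job per slice of 10 ids.
class WarmAllCachesJob
  def perform
    MyModel.pluck_ids.each_slice(10) do |slice|
      QUEUE << [WarmSliceJob, slice]
    end
  end
end

# Each child job fetches its 10 records with a single query.
class WarmSliceJob
  def perform(ids)
    MyModel.where_id(ids).each do |record|
      CACHE["recent_transactions/#{record.id}"] = record.recent_transactions
    end
  end
end

WarmAllCachesJob.new.perform
QUEUE.each { |job_class, ids| job_class.new.perform(ids) }
```

With 25 records and slices of 10 this enqueues three child jobs, each small enough to retry safely and each hitting the DB once instead of ten times.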
Different queues and workers
The same approach can be applied to other situations (not just cache pre-generation). When a record is created or updated, we might have a callback (see the previous post) to update various reports. The primary
UpdateReportsJob will be called from an
after_save callback. We want it to complete as quickly as possible and queue a separate
UpdateEachReportJob for each report, passing the appropriate report ID. We can process these jobs through separate queues.
This way each server will have a dedicated process watching only the
high queue, ensuring that those jobs complete as quickly as possible and do not get backlogged. The other three workers will process the
default queue (used for other jobs) and the
low queue (used for reports).
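The queue routing can be sketched like this (plain Ruby). QUEUES stands in for the backend's named queues, and REPORT_IDS and the queue names are illustrative assumptions; in a real app these would be Sidekiq/ActiveJob queues configured per worker process:

```ruby
QUEUES = { high: [], default: [], low: [] }  # stand-in for named backend queues
REPORT_IDS = [1, 2, 3]                       # illustrative report ids
UPDATED = []

# Fast parent job, routed to the high queue from an after_save callback;
# it only fans out per-report child jobs.
class UpdateReportsJob
  def perform(_record_id)  # _record_id identifies the saved record (unused in this sketch)
    REPORT_IDS.each { |rid| QUEUES[:low] << [UpdateEachReportJob, rid] }
  end
end

# Slower per-report job, processed from the low queue.
class UpdateEachReportJob
  def perform(report_id)
    UPDATED << report_id  # placeholder for the real report refresh
  end
end

# The dedicated worker drains high first; other workers handle default/low.
QUEUES[:high] << [UpdateReportsJob, 42]
QUEUES[:high].each { |job, arg| job.new.perform(arg) }
QUEUES[:low].each  { |job, arg| job.new.perform(arg) }
```

The parent job finishes as soon as it has enqueued the fan-out, so the after_save callback never waits on the slow report work.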