I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Distributing DB load when running background jobs
What if we had a multi-tenant system where we needed to generate various reports? Typically we would do it at night, when the load on the DB is usually lower.
One job
A simple solution is to create a background job using ActiveJob with a Sidekiq backend.
We can schedule it using sidekiq-cron.
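As a minimal sketch of this setup, the job and its nightly schedule could look like the following. It assumes a Rails app that uses Sidekiq as the ActiveJob queue adapter, a `Tenant` model, and a hypothetical `ReportGenerator` service object; none of these are defined in this post, and the 2 AM cron expression is only an example.

```ruby
# app/jobs/generate_reports_job.rb
class GenerateReportsJob < ApplicationJob
  queue_as :default

  # Builds the reports for every tenant in a single run.
  def perform
    Tenant.find_each do |tenant|
      ReportGenerator.new(tenant).run # hypothetical service object
    end
  end
end

# config/initializers/sidekiq_cron.rb
# Registers the nightly schedule; 2 AM is an assumed quiet hour.
Sidekiq::Cron::Job.create(
  name: "Generate reports - nightly",
  cron: "0 2 * * *",
  class: "GenerateReportsJob"
)
```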
Multiple jobs
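The problem is that this creates one long-running job, which could fail midway before it has reached every tenant. What we want is to separate report scheduling from report generation: one job enqueues a separate report job per tenant. We will use GlobalID to identify tenants; ActiveJob serializes a record passed as a job argument into its GlobalID and loads the record again when the job runs.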
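One way to sketch this split is with two jobs: a scheduler that only enqueues work, and a per-tenant job that does the work. As before, `Tenant` and `ReportGenerator` are assumed names for illustration, not something this post defines.

```ruby
# app/jobs/schedule_reports_job.rb
class ScheduleReportsJob < ApplicationJob
  # Only enqueues work: one GenerateReportJob per tenant.
  def perform
    Tenant.find_each do |tenant|
      # The record is serialized as a GlobalID, not as a full object.
      GenerateReportJob.perform_later(tenant)
    end
  end
end

# app/jobs/generate_report_job.rb
class GenerateReportJob < ApplicationJob
  # `tenant` arrives as a loaded record, looked up from its GlobalID.
  def perform(tenant)
    ReportGenerator.new(tenant).run # hypothetical service object
  end
end
```

With this split, a failure in one tenant's report affects only that tenant's job, which Sidekiq can retry on its own.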
The problem now is that all of the jobs run at the same time, putting extra load on our DB all at once. Instead, we will modify our code to schedule the first job immediately, the second in 5 minutes, the third in 10 minutes, and so on. Sidekiq keeps delayed jobs in a Redis sorted set, scored by the time they are due, until it is time to run them.
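The staggering itself can be sketched as a small helper. `staggered` and `schedule_staggered` are hypothetical names; `job_class` is assumed to be an ActiveJob class such as the per-tenant report job, and `set(wait:)` is the ActiveJob call for delaying a job.

```ruby
# Pairs each record with a delay in seconds:
# the first gets 0, the second one gap, the third two gaps, and so on.
def staggered(records, gap_seconds: 300)
  records.each_with_index.map { |record, index| [record, index * gap_seconds] }
end

# Enqueues one delayed job per tenant.
# job_class is assumed to respond to ActiveJob's `set(wait:)`.
def schedule_staggered(tenants, job_class, gap_seconds: 300)
  staggered(tenants, gap_seconds: gap_seconds).each do |tenant, delay|
    job_class.set(wait: delay).perform_later(tenant)
  end
end
```

Inside the scheduling job, this would be called with the list of tenants and the report job class. Keeping the delay calculation in its own method makes it easy to check without Redis or Rails.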
This approach might not scale if we have hundreds of tenants because the total delay grows too long: with 300 tenants and a 5-minute gap, the last job would start almost 25 hours after the first. In that case we would need to reduce the gap. But this is a simple way of distributing load on the DB and potentially saving money on hosting costs.
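One possible way to pick the gap is to derive it from the number of tenants and the time window in which all jobs should start. This is only a sketch: `gap_for` is a hypothetical helper, and the one-hour window in the usage below is an assumed value.

```ruby
# Returns the gap in seconds between consecutive jobs so that all of them
# start within the window, never exceeding max_gap_seconds.
def gap_for(tenant_count, window_seconds:, max_gap_seconds: 300)
  return 0 if tenant_count <= 1

  # Integer division: with more tenants than seconds, the gap drops to 0.
  [window_seconds / tenant_count, max_gap_seconds].min
end

gap_for(12, window_seconds: 3600)  # => 300
gap_for(300, window_seconds: 3600) # => 12
```

With a small number of tenants the gap stays at the 5-minute cap; as the tenant count grows, it shrinks so the whole batch still fits into the window.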
Links
- http://sidekiq.org
- https://redis.io/
- https://github.com/ondrejbartas/sidekiq-cron
- https://github.com/rails/globalid