In our application we send out lots of emails, and our clients need to control both when the emails go out and their exact content. Here is a previous post on how we first attempted to solve this. We later switched to the SendGrid bulk send API to avoid making an individual API call for every email.
Here are the basic models implemented with Mongoid:
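A sketch of what those models might look like. The field names and statuses here are assumptions for illustration, not necessarily the original schema:

```ruby
# Assumed fields; the real models may differ.
class Newsletter
  include Mongoid::Document

  field :subject, type: String
  field :body,    type: String
  field :send_at, type: Time
  field :status,  type: Symbol, default: :draft  # :draft -> :scheduled -> :sent
end

class User
  include Mongoid::Document

  field :email, type: String
end
```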
We created a simple job and set up cron to run SendNewslettersJob.perform_later every 5 minutes. If there are no newsletters to send, it does nothing.
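The first version of that job might look like the sketch below. SendGridClient.send_bulk is a hypothetical wrapper around the SendGrid bulk send API call, not a real library method:

```ruby
class SendNewslettersJob < ApplicationJob
  queue_as :default

  def perform
    # Pick up every newsletter that is due and not yet sent.
    Newsletter.where(status: :scheduled, :send_at.lte => Time.now).each do |newsletter|
      # One bulk API request for all recipients.
      # SendGridClient is a hypothetical wrapper; actual client code omitted.
      SendGridClient.send_bulk(newsletter, User.pluck(:email))
      newsletter.update(status: :sent)
    end
  end
end
```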
The problem with this approach is that a newsletter might go to 100 users or to 100K. The process runs sequentially, so one large sending can delay the others. And it’s best to pass email addresses to SendGrid in reasonably sized chunks (say, 100 at a time).
The first step is to break up the work into a separate job per newsletter so they can run in parallel.
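One way to sketch that split: the cron job only enqueues, and a new per-newsletter job does the sending (SendGridClient remains a hypothetical wrapper):

```ruby
class SendNewslettersJob < ApplicationJob
  def perform
    Newsletter.where(status: :scheduled, :send_at.lte => Time.now).each do |newsletter|
      # One job per newsletter, so a large sending no longer blocks the others.
      SendNewsletterJob.perform_later(newsletter.id.to_s)
    end
  end
end

class SendNewsletterJob < ApplicationJob
  def perform(newsletter_id)
    newsletter = Newsletter.find(newsletter_id)
    SendGridClient.send_bulk(newsletter, User.pluck(:email))
    newsletter.update(status: :sent)
  end
end
```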
Next let’s change it so each sending goes to a group of 100 users.
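Chunking is just Enumerable#each_slice over the recipient list; each slice of 100 addresses becomes its own group job. A sketch, with SendGridClient still standing in for the actual API call:

```ruby
class SendNewsletterJob < ApplicationJob
  def perform(newsletter_id)
    newsletter = Newsletter.find(newsletter_id)
    # Enqueue one job per group of 100 email addresses.
    User.pluck(:email).each_slice(100) do |emails|
      SendNewsletterUserGroupJob.perform_later(newsletter_id, emails)
    end
    newsletter.update(status: :sent)
  end
end

class SendNewsletterUserGroupJob < ApplicationJob
  def perform(newsletter_id, emails)
    SendGridClient.send_bulk(Newsletter.find(newsletter_id), emails)
  end
end
```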
One problem with this approach is newsletter.update(status: :sent). We have not actually sent the emails yet; the group jobs are merely queued. What we really want is to run each sending job and update the newsletter status only when the last job completes.
We need to record the IDs of all individual jobs in the batch. I like using Redis for storing this kind of ephemeral data, and for a unique list of IDs a Redis SET is a good data structure.
We create a unique batch_id, grab each job's job_id, and record them using SADD.
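A sketch of the enqueue side, using SecureRandom.uuid for the batch_id and the redis-rb client (the Redis connection setup and the batch key naming are assumptions). perform_later returns the job instance, whose job_id we record; the group job also receives the batch_id so it can report back:

```ruby
class SendNewsletterJob < ApplicationJob
  REDIS = Redis.new  # redis-rb client; connection details are app-specific

  def perform(newsletter_id)
    batch_id  = SecureRandom.uuid
    batch_key = "newsletter_batch:#{batch_id}"
    User.pluck(:email).each_slice(100) do |emails|
      job = SendNewsletterUserGroupJob.perform_later(newsletter_id, emails, batch_id)
      # Record the queued job's ID in the batch SET.
      REDIS.sadd(batch_key, job.job_id)
    end
  end
end
```

Note there is a small race here: a group job could finish before its ID is added to the SET. Collecting all job IDs first and enqueueing only after the SADDs is one way to close that gap.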
Now each sending job, upon completion, can remove its own job ID from Redis and check whether any other jobs are left.
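The completion check maps to SREM plus SCARD: remove your own ID, and if the SET is empty you were the last job. A sketch (ActiveJob provides job_id inside the job; SendGridClient remains hypothetical):

```ruby
class SendNewsletterUserGroupJob < ApplicationJob
  REDIS = Redis.new  # redis-rb client; connection details are app-specific

  def perform(newsletter_id, emails, batch_id)
    SendGridClient.send_bulk(Newsletter.find(newsletter_id), emails)

    batch_key = "newsletter_batch:#{batch_id}"
    REDIS.srem(batch_key, job_id)     # remove our own ID from the batch SET
    if REDIS.scard(batch_key).zero?   # no IDs left: this was the last job
      Newsletter.find(newsletter_id).update(status: :sent)
    end
  end
end
```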
We can now consolidate our jobs so SendNewslettersJob calls SendNewsletterUserGroupJob directly.
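Consolidated, the intermediate per-newsletter job disappears: the cron job itself chunks the recipients and enqueues the group jobs. A sketch under the same assumptions as above:

```ruby
class SendNewslettersJob < ApplicationJob
  REDIS = Redis.new  # redis-rb client; connection details are app-specific

  def perform
    Newsletter.where(status: :scheduled, :send_at.lte => Time.now).each do |newsletter|
      batch_id  = SecureRandom.uuid
      batch_key = "newsletter_batch:#{batch_id}"
      User.pluck(:email).each_slice(100) do |emails|
        job = SendNewsletterUserGroupJob.perform_later(newsletter.id.to_s, emails, batch_id)
        REDIS.sadd(batch_key, job.job_id)
      end
    end
  end
end
```

The group job from the previous step then flips the newsletter to :sent when the batch SET empties out.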
Also, here is a relevant post on using Sidekiq batches for data import.