I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Modular code structure
More thoughts on structuring code and running it via background jobs. This post was inspired by me trying to wrap my head around Sandi Metz’ Rules For Developers.
Let’s imagine we need to import CSV file into our DB in a Rails application. It’s easy to write a simple class with a single method (or a Rake task).
# app/services/user_import.rb
class UserImport
def perform
CSV.parse(File.read('...'), headers: true).each do |row|
User.create! row.to_hash
end
end
end
You can run it via rails r UserImport.new.perform
. But it’s hard to test this code. You need to create different CSV files with valid and invalid data. It is also harder to scale this. Next step is to break up into reading the file and processing each row.
# app/services/user_import.rb
class UserImport
def perform
CSV.parse(File.read('...'), headers: true).each do |row|
process_row row.to_hash
end
end
private
def process_row row
User.create! row
end
end
You can test private method process_row with .send
and pass various params. But it’s still going to process the records one at a time which is slow. And what if you restart server? So let’s break up code into separate classes.
# app/services/user_import.rb
class UserImport
def perform
CSV.parse(File.read('...'), headers: true).each do |row|
ProcessUser.new.perform row.to_hash
end
end
end
# app/services/process_user.rb
class ProcessUser
def perform row
User.create! row
end
end
Now let’s wrap each service object into ActiveJob. You want to use something like Resque / Sidekiq / SQS so job queueing is very fast. This will allow you to quickly queue up the jobs and process them in background parallel to each other. Even if you completely shutdown both webserver and background job process the jobs will still be persisted.
# app/jobs/user_import_job.rb
class UserImportJob < ApplicationJob
def perform
UserImport.new.perform
end
end
# app/services/user_import.rb
class UserImport
def perform
CSV.parse(File.read('...'), headers: true).each do |row|
ProcessUserJob.perform_later row.to_hash
end
end
end
# app/jobs/process_user_job.rb
class ProcessUserJob < ApplicationJob
def perform row
ProcessUser.perform row
end
end
# app/services/process_user.rb
class ProcessUser
def perform row
User.create! row
end
end
As you can see the jobs are just very thin wrappers around service objects. But what if you don’t want to have separate classes?
# app/jobs/user_import_job.rb
class UserImportJob < ApplicationJob
def perform
process_file
end
private
def process_file
CSV.parse(File.read('...'), headers: true).each do |row|
ProcessUserJob.perform_later row.to_hash
end
end
end
# app/jobs/process_user_job.rb
class ProcessUserJob < ApplicationJob
def perform row
process_user row
end
private
def process_user row
User.create! row
end
end
We are back to just 2 files but the actual business logic is encapsulated in private methods which can be eaisly tested like any Ruby methods. Here is a great blog post.
Overall the amount of code increased from 7 lines total in first snippet to 21 but most of that code is simple class declarations and method definitions. In real applications your business logic will be much more complex and comprise much higher % of code. So a little bit of overhead that comes with breaking the code apart will matter much less. However the modularity and ease of understanding / testing your code will more thay pay off.