I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Rails and complex data migrations
Oct 27, 2016
When working with NoSQL DBs we do not worry about schema changes, but we still need to do data migrations. We have been using the mongoid_rails_migrations gem for this.
Sometimes these migrations get complex. We can have 30+ lines in the up method as we loop through records, validate / transform the data and then update / create other records in our DB. Why not move that logic into separate private methods in the migration class (it’s a Ruby class after all) and call them as needed?
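Here is a minimal sketch of that idea. The class name, the email field and the helper names are made up for illustration, records are plain hashes, and up takes the records as an argument so the sketch runs without a database. In a real migration up would take no arguments and query the model itself.

```ruby
# Stand-in base class so the sketch runs without the gem installed;
# with mongoid_rails_migrations loaded the real Mongoid::Migration is used.
unless defined?(Mongoid::Migration)
  module Mongoid
    class Migration; end
  end
end

class NormalizeUserEmails < Mongoid::Migration
  # `records` is an assumption made for this sketch (an array of hashes).
  def self.up(records = [])
    records.each do |record|
      next unless migratable?(record)
      record[:email] = normalize_email(record[:email])
    end
  end

  def self.down
    # Lowercasing is not reversible, so there is nothing to undo.
  end

  class << self
    private

    # Validation step: skip records with a missing or blank email.
    def migratable?(record)
      record[:email].is_a?(String) && !record[:email].strip.empty?
    end

    # Transformation step: trim whitespace and lowercase.
    def normalize_email(email)
      email.strip.downcase
    end
  end
end
```

The up method now reads as a short list of steps, and each step can be changed without touching the loop.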
Exception handling
When running these migrations it might be OK to just skip a few errors and continue. For that we can rescue exceptions inside the loop. I also like to use a limit clause to speed things up when debugging.
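A rough sketch of both ideas, using a plain array in place of a query. The helper name migrate_records and its limit: option are hypothetical; with Mongoid the limit would normally be applied to the query itself via limit.

```ruby
# Hypothetical helper: run the block for each record, skip the ones that
# raise, and return the failures so they can be inspected afterwards.
def migrate_records(records, limit: nil)
  # While debugging, only look at the first `limit` records.
  scope = limit ? records.first(limit) : records
  failures = []

  scope.each do |record|
    begin
      yield record
    rescue StandardError => e
      # Remember the failure and carry on with the next record.
      failures << [record, e.message]
    end
  end

  failures
end
```

Returning the failures instead of swallowing them makes it easy to log them or re-run the migration for just those records.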
Testing
Sometimes the migrations are so complex that we want to write actual automated tests.
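As an illustration, here is a small Minitest sketch. The migration SplitFullName and its fields are invented for this example, and records are plain hashes so that it runs without a database. The point is only that the transformation lives in a method we can call directly from a test.

```ruby
require "minitest"

# Hypothetical migration: split full_name into first_name and last_name.
class SplitFullName
  def self.up(records)
    records.each do |record|
      record[:first_name], record[:last_name] = split_name(record[:full_name])
    end
  end

  # Pure function, so it is easy to test on its own.
  def self.split_name(full_name)
    first, last = full_name.to_s.strip.split(/\s+/, 2)
    [first, last]
  end
end

class SplitFullNameTest < Minitest::Test
  def test_splits_first_and_last_name
    assert_equal ["Ada", "Lovelace"], SplitFullName.split_name("Ada Lovelace")
  end

  def test_single_name_has_no_last_name
    assert_equal ["Cher", nil], SplitFullName.split_name("Cher")
  end

  def test_up_updates_every_record
    records = [{ full_name: "Mary Ann Evans" }]
    SplitFullName.up(records)
    assert_equal "Ann Evans", records.first[:last_name]
  end
end
```

To actually run the tests, require minitest/autorun instead of minitest. The same structure should carry over to RSpec.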
The same approach should work with data migrations in SQL DBs. Just treat migrations as Ruby classes and test their methods.
We now need to rename the User model to Person. We can rename the class and DB table but how do we change the article relationships? Well, as long as the IDs of individual person/user records did not change we can do this:
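One possible shape for this, assuming the relationship is stored as a plain user_id foreign key on each article that should become person_id. Both field names are assumptions, and articles are plain hashes here so the sketch runs without a database.

```ruby
# Move the reference from the old key to the new one. The stored ID is
# untouched, which is why this only works if the record IDs did not change.
def move_author_reference(article)
  return article unless article.key?("user_id")

  article["person_id"] = article.delete("user_id")
  article
end

# With the MongoDB Ruby driver the same change could be made in one
# statement (shown as a comment only, assuming an Article model):
#   Article.collection.update_many({}, { "$rename" => { "user_id" => "person_id" } })
```

Articles that were already migrated are left alone, so the migration can safely be run more than once.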
Now we need to rename Group to Team. Here is the migration.
Lots of data
Let’s imagine a blogging platform.
Now we need to create a relationship between comment and article author.
And we need a migration to update records. But we have millions of comments and thousands of articles. This will be VERY slow as it will query for each article AND user and then do individual updates.
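To make the cost visible, here is a sketch that replaces the database with an in-memory store that counts round trips. CountingStore, the Struct models and the author_id field are all stand-ins invented for this example, not Mongoid APIs.

```ruby
User    = Struct.new(:id)
Article = Struct.new(:id, :user_id)
Comment = Struct.new(:id, :article_id, :author_id)

# In-memory stand-in for the database; every method call is one "query".
class CountingStore
  attr_reader :queries

  def initialize(users:, articles:, comments:)
    @users    = users.to_h { |u| [u.id, u] }
    @articles = articles.to_h { |a| [a.id, a] }
    @comments = comments
    @queries  = 0
  end

  def comments
    @queries += 1
    @comments
  end

  def find_article(id)
    @queries += 1
    @articles.fetch(id)
  end

  def find_user(id)
    @queries += 1
    @users.fetch(id)
  end

  def save(comment)
    @queries += 1
    comment
  end
end

# One article lookup, one user lookup and one write for EVERY comment.
def backfill_one_by_one(store)
  store.comments.each do |comment|
    article = store.find_article(comment.article_id)
    author  = store.find_user(article.user_id)
    comment.author_id = author.id
    store.save(comment)
  end
end
```

The query count grows as 1 + 3 × the number of comments, so millions of comments mean millions of round trips.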
This will be faster because it will eager load related articles. But it will require lots of RAM.
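The same kind of in-memory sketch for eager loading. With Mongoid the eager load would typically be written with includes on the query. CountingStore and the Structs are stand-ins invented for this example.

```ruby
Article = Struct.new(:id, :user_id)
Comment = Struct.new(:id, :article_id, :author_id)

# In-memory stand-in for the database; every method call is one "query".
class CountingStore
  attr_reader :queries

  def initialize(articles:, comments:)
    @articles = articles
    @comments = comments
    @queries  = 0
  end

  def comments
    @queries += 1
    @comments
  end

  # A single query that returns all requested articles, keyed by id.
  def articles_by_id(ids)
    @queries += 1
    @articles.select { |a| ids.include?(a.id) }.to_h { |a| [a.id, a] }
  end

  def save(comment)
    @queries += 1
    comment
  end
end

def backfill_with_eager_loading(store)
  comments = store.comments
  # Every related article is loaded up front and kept in memory.
  articles = store.articles_by_id(comments.map(&:article_id).uniq)

  comments.each do |comment|
    comment.author_id = articles.fetch(comment.article_id).user_id
    store.save(comment)
  end
end
```

The per-comment lookups are gone, but there is still one write per comment, and all articles sit in RAM at once.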
This will be even faster because it will do bulk updates for ALL comments of a specific article, but it will still require lots of RAM.
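Sketched with the same kind of in-memory stand-in (CountingStore and the Structs are invented for this example). With Mongoid the bulk write would typically be an update_all on the comments scoped to one article.

```ruby
Article = Struct.new(:id, :user_id)
Comment = Struct.new(:id, :article_id, :author_id)

# In-memory stand-in for the database; every method call is one "query".
class CountingStore
  attr_reader :queries

  def initialize(articles:, comments:)
    @articles = articles
    @comments = comments
    @queries  = 0
  end

  def articles
    @queries += 1
    @articles
  end

  # One bulk write that touches every comment of the given article.
  def update_comments_for(article_id, author_id)
    @queries += 1
    matched = @comments.select { |c| c.article_id == article_id }
    matched.each { |c| c.author_id = author_id }
    matched.size
  end
end

def backfill_in_bulk(store)
  # All articles are still loaded in one go, hence the RAM usage.
  store.articles.each do |article|
    store.update_comments_for(article.id, article.user_id)
  end
end
```

Now the number of queries depends on the number of articles (thousands) rather than the number of comments (millions).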
This will break up work into smaller chunks for each group of users (by company). It will require far less RAM.
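A sketch of the chunked version, assuming each user has a company_id (an assumption, as are ChunkedStore and the Structs). The store records the largest number of articles loaded at once so we can see the effect on memory.

```ruby
User    = Struct.new(:id, :company_id)
Article = Struct.new(:id, :user_id)
Comment = Struct.new(:id, :article_id, :author_id)

# In-memory stand-in that tracks the biggest batch of articles ever loaded.
class ChunkedStore
  attr_reader :largest_load

  def initialize(users:, articles:, comments:)
    @users        = users
    @articles     = articles
    @comments     = comments
    @largest_load = 0
  end

  def company_ids
    @users.map(&:company_id).uniq
  end

  # Load only the articles written by users of one company.
  def articles_for_company(company_id)
    user_ids = @users.select { |u| u.company_id == company_id }.map(&:id)
    loaded   = @articles.select { |a| user_ids.include?(a.user_id) }
    @largest_load = [@largest_load, loaded.size].max
    loaded
  end

  # One bulk write for all comments of the given article.
  def update_comments_for(article_id, author_id)
    @comments.each { |c| c.author_id = author_id if c.article_id == article_id }
  end
end

def backfill_by_company(store)
  store.company_ids.each do |company_id|
    store.articles_for_company(company_id).each do |article|
      store.update_comments_for(article.id, article.user_id)
    end
  end
end
```

Only one company's articles are in memory at any time, at the cost of a few more queries.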
Alternatively we could batch users. With ActiveRecord we could use find_in_batches. For Mongoid, use something like this gist.
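The batching idea itself can be sketched in plain Ruby with each_slice. The method name and the Structs below are invented for this example. In a real app the slices would come from find_in_batches or an equivalent helper rather than from an array already in memory.

```ruby
User    = Struct.new(:id)
Article = Struct.new(:id, :user_id)
Comment = Struct.new(:id, :article_id, :author_id)

# Process users in fixed-size batches and return how many batches were run.
def backfill_in_user_batches(users, articles, comments, batch_size: 1000)
  batches = 0

  users.each_slice(batch_size) do |batch|
    user_ids = batch.map(&:id)

    # Article id => author id, limited to the users in this batch.
    authors = articles
              .select { |a| user_ids.include?(a.user_id) }
              .to_h { |a| [a.id, a.user_id] }

    comments.each do |comment|
      next unless authors.key?(comment.article_id)
      comment.author_id = authors[comment.article_id]
    end

    batches += 1
  end

  batches
end
```

The batch size is the knob here: smaller batches use less RAM per step but need more round trips.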