Migration and Seed Ground Rules
I found advice on proper seeding and migration practices scattered all over the web, so here’s a writeup of what I think are the best practices for migrations and seeding in Rails (3.2 at the time of writing).
Migrations should contain:
- schema DSL
- raw SQL
- properly isolated ActiveRecord classes
Migrations must not contain:
- any dependence on code that will change in the future
- seed data
Application code grows and changes as your app evolves, but migrations should be timeless and always valid. Using your domain code in migrations makes them brittle: a migration that calls a model method which is later renamed or removed will fail the next time it is run from scratch.
seeds.rb should contain:
- data creation statements using application code
How seeds.rb should behave:
- it should be idempotent (it can be run many times and it will ensure everything is present in the database without breaking)
- it must not destroy existing data
- it must break loudly if some of the data failed to be created
What does a good migration look like?
Having an isolated ActiveRecord model inside the migration is convenient if the migration is not trivial and we can benefit from having a true ActiveRecord model at hand.
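As a minimal sketch of such a migration (not the original example), assume a hypothetical posts table with a title column that needs a new slug column backfilled from the title. The table, column, and class names are illustrative only. The superclass is written in the Rails 3.2 style; newer Rails versions expect a versioned superclass such as ActiveRecord::Migration[5.0].

```ruby
# db/migrate/20130101000000_add_slug_to_posts.rb (hypothetical file name)
class AddSlugToPosts < ActiveRecord::Migration
  # Migration-local model: it knows only the table name, so later changes
  # to app/models/post.rb (validations, callbacks, renames) cannot break it.
  class Post < ActiveRecord::Base
    self.table_name = "posts"
  end

  def up
    add_column :posts, :slug, :string

    # Refresh the cached column list so the new column is visible.
    Post.reset_column_information

    Post.find_each do |post|
      # update_column writes directly, skipping validations and callbacks.
      post.update_column(:slug, post.title.to_s.parameterize)
    end
  end

  def down
    remove_column :posts, :slug
  end
end
```

Because the nested Post class is defined inside the migration, it is resolved before the application’s own Post model and carries none of its behavior.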
What does a good seeds.rb look like?
Using bang methods (such as create! or save!) ensures that validation errors raise an exception instead of passing unnoticed.
Using find_or_create_by ensures we create the records only if they are not present.
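A seeds.rb along these lines might look like the sketch below. Role and User are hypothetical application models, and the attribute names are assumptions made for illustration. The find_or_create_by! and find_by! calls use the Rails 4+ spelling; on Rails 3.2 the closest equivalent is Role.where(:name => name).first_or_create!.

```ruby
# db/seeds.rb
# Role and User are hypothetical application models.

%w[admin editor member].each do |name|
  # Looks the record up by name and creates it only if it is missing.
  # The bang variant raises if the new record fails validation.
  Role.find_or_create_by!(name: name)
end

# The block runs only when a new record is being created, so rerunning
# the seeds never overwrites an existing user.
User.find_or_create_by!(email: "admin@example.com") do |user|
  user.name = "Administrator"
  user.role = Role.find_by!(name: "admin") # raises if the role is missing
end
```

Running this file twice leaves the database in the same state as running it once, and any validation failure aborts the run with an exception.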
If the seeds change, we can rerun the seed task. If existing seed records must be changed or removed, that should be handled in migrations through update or delete statements.
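For instance, renaming an existing seed record could be handled with raw SQL in a migration. This is only a sketch: the roles table and the role names are hypothetical, and the superclass is written in the Rails 3.2 style (newer versions expect something like ActiveRecord::Migration[5.0]).

```ruby
# db/migrate/20130102000000_rename_editor_role_to_author.rb (hypothetical)
class RenameEditorRoleToAuthor < ActiveRecord::Migration
  def up
    # Raw SQL keeps the migration independent of the Role model.
    execute "UPDATE roles SET name = 'author' WHERE name = 'editor'"
  end

  def down
    execute "UPDATE roles SET name = 'editor' WHERE name = 'author'"
  end
end
```

The matching entry in seeds.rb would need the new name as well, otherwise the next db:seed run would recreate the old record.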
All of this ensures that:
- migrations will be completely independent from the rest of the application code, and will not break even if run from the very start
- seeds are all in one place
- db:seed can always be run safely (e.g. as part of deployment)
This implies that when deploying, both db:migrate and db:seed must be run.
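One way to make sure the two always run together is a small wrapper Rake task. The sketch below assumes a hypothetical deploy:database task name and file location; Rake runs prerequisites in the order they are listed.

```ruby
# lib/tasks/deploy.rake (hypothetical file and task name)
namespace :deploy do
  desc "Migrate the schema, then ensure seed data is present"
  task :database => ["db:migrate", "db:seed"]
end
```

A deployment script could then call rake deploy:database instead of invoking the two tasks separately.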
I think invoking db:seed from a migration only when necessary might be a good solution too.
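A rough sketch of that idea, under the assumption that the seeds are idempotent as described above: the migration calls Rails.application.load_seed, which is the method the db:seed task uses to load db/seeds.rb, so it does not depend on Rake being loaded. The class name is hypothetical, and the superclass is written in the Rails 3.2 style.

```ruby
# db/migrate/20130103000000_load_default_seeds.rb (hypothetical)
class LoadDefaultSeeds < ActiveRecord::Migration
  def up
    # Loads db/seeds.rb; safe to repeat only because the seeds are idempotent.
    Rails.application.load_seed
  end

  def down
    # Seeds never destroy data, so there is nothing to undo here.
  end
end
```

Since seeds.rb uses application code, a migration like this is only as timeless as the seeds themselves, which is a reason to reach for it sparingly.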
This covers topics that I struggled to find coherent answers to, though it certainly isn’t the one true way of doing things.