Data Migrations using Model Classes vs. SQL

I'm working on my first official migration that inserts data rather than changing the structure of the database; there's some data we'd like to be present in any database, so it seems sensible to put it in a migration.

However, I'm realizing that it's not all that simple. The obvious way to do a data migration is to use the model class. Imagine that I have an organization with a mandatory name and an optional description:

e.g. create_table :organizations do |t|   t.column :name, :string, :null => false   t.column :description, :string end

So, for instance, if I want an organization called "Us", i might do:

def self.up   Organization.new( :name => 'Us' ).save! end

This works ok for now, but I'm realizing that it couples the migration to the 'current' version of the model class. For instance, if I later make description mandatory, and validate that (validates_presence_of), this migration will cease to work. That means that a new developer won't be able to get the database by running the migration, because the migration will complain that the organization does not have a description.

So -- does that mean I should do all data migrations using SQL statements? How are other people approaching this?

  - Geoffrey

I think there's a recipe In the "Rails Recipes" book for this

I'm working on my first official migration that inserts data rather than changing the structure of the database; there's some data we'd like to be present in any database, so it seems sensible to put it in a migration.

However, I'm realizing that it's not all that simple. The obvious way to do a data migration is to use the model class. Imagine that I have an organization with a mandatory name and an optional description:

e.g. create_table :organizations do |t|   t.column :name, :string, :null => false   t.column :description, :string end

So, for instance, if I want an organization called "Us", i might do:

def self.up   Organization.new( :name => 'Us' ).save! end

You can simplify that to Organization.create :name => 'Us'

This works ok for now, but I'm realizing that it couples the migration to the 'current' version of the model class. For instance, if I later make description mandatory, and validate that (validates_presence_of), this migration will cease to work. That means that a new developer won't be able to get the database by running the migration, because the migration will complain that the organization does not have a description.

So -- does that mean I should do all data migrations using SQL statements? How are other people approaching this?

One thing you can do is to pre-declare your model class at the top of your migration file:

   class Organization < ActiveRecord::Base    end

   class MyMigration < ActiveRecord::Migration       def self.up          ...

Since the Organization class exists, Rails won't try to load your model file. Without the model file, you don't get validations, associations, etc.

Two important comments on this...

I'm working on my first official migration that inserts data rather than changing the structure of the database; there's some data we'd like to be present in any database, so it seems sensible to put it in a migration.

However, I'm realizing that it's not all that simple. The obvious way to do a data migration is to use the model class. Imagine that I have an organization with a mandatory name and an optional description:

e.g. create_table :organizations do |t|   t.column :name, :string, :null => false   t.column :description, :string end

So, for instance, if I want an organization called "Us", i might do:

def self.up   Organization.new( :name => 'Us' ).save! end

You can simplify that to Organization.create :name => 'Us'

If you create or change the table, then you must call Organization.reset_column_information before trying to do anything through the model.

This works ok for now, but I'm realizing that it couples the migration to the 'current' version of the model class. For instance, if I later make description mandatory, and validate that (validates_presence_of), this migration will cease to work. That means that a new developer won't be able to get the database by running the migration, because the migration will complain that the organization does not have a description.

So -- does that mean I should do all data migrations using SQL statements? How are other people approaching this?

One thing you can do is to pre-declare your model class at the top of your migration file:

   class Organization < ActiveRecord::Base    end

   class MyMigration < ActiveRecord::Migration       def self.up          ...

Since the Organization class exists, Rails won't try to load your model file. Without the model file, you don't get validations, associations, etc.

Actually, you want to declare a minimal version of your model *INSIDE* the migration class like so:

   class MyMigration < ActiveRecord::Migration      class Organization < ActiveRecord::Base      end

     def self.up           ...      end    end

This way, you can have several migrations that use "the" model. You can even define little helper methods that might only make sense during the migration.

It's nice to use 'say' or 'say_with_time' in the migration if you're doing a lot of data shuffling so you (or the other developers) don't assume that the migration is stuck.

-Rob

Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com

Ah, so there is; just for anyone who finds this thread later, this is mostly the same information that's being delivered by other people on this thread, so if you don't have the book handy, just read on.