Migrations Shortcomings

Migrations are supposed to be both a way to make your database
definition platform (db2, oracle, mysql, etc) agnostic, and a good way
to share database changes between team members.

While it works excellently for: creating a versionable database
definition, and not tying you to one database, it has several
shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other
team members just update from source control, will now have a
024_add_something.rb, run rake migrate, and presto, they're database
looks like every other developers.

The problem is, databases may change a lot. Picture a fictional
accounting system with:
001_add_users.rb,
002_add_invoices.rb,
003_add_payments.rb, etc

Now, during the course of development, the users table for instance may
change a lot. You may need to add new fields, maybe we changed the max
length for last names, now we want to track roles/user types, someone
stored home-phone as one field, but we want it broken out into area
code, exchange, etc. This could be a lot of little changes. If you add
a new migration for each one of these, it is MUCH harder than just
looking at one create_table call in one file to see the structure.

What we end up doing is just modifying the original 001_add_users.rb. A
simple rake migrate VERSON=0, followed by rake migrate brings you up to
date EXCEPT that you lose all your data. Fine in development, but now
you cap deploy to your test box with data that matters, and this
doesn't work.

What do people think are best practices for this?

I somewhat feel that migrations, along with code in them to transform
data when moving back/forward make a lot more sense once the code has
reached a stable point, i.e. coinciding with actual releases.

Perhaps it would be cool if running migrate generated intermediate SQL
that you could check out before creating the tables, to avoid having
one table def scattered through 8 migration files.

I totally agree - I do the same things as you.

Also, it's annoying when I make a new migration (lets say 005) and then
my partner makes one too (his will also be 005!). If you only submit
code every few days then it gets tiresome renaming them.

-Ben Lisbakken

(skip down)

Migrations are supposed to be both a way to make your database
definition platform (db2, oracle, mysql, etc) agnostic, and a good way
to share database changes between team members.

While it works excellently for: creating a versionable database
definition, and not tying you to one database, it has several
shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other
team members just update from source control, will now have a
024_add_something.rb, run rake migrate, and presto, they're database
looks like every other developers.

The problem is, databases may change a lot. Picture a fictional
accounting system with:
001_add_users.rb,
002_add_invoices.rb,
003_add_payments.rb, etc

Now, during the course of development, the users table for instance may
change a lot. You may need to add new fields, maybe we changed the max
length for last names, now we want to track roles/user types, someone
stored home-phone as one field, but we want it broken out into area
code, exchange, etc. This could be a lot of little changes. If you add
a new migration for each one of these, it is MUCH harder than just
looking at one create_table call in one file to see the structure.

What we end up doing is just modifying the original 001_add_users.rb. A
simple rake migrate VERSON=0, followed by rake migrate brings you up to
date EXCEPT that you lose all your data. Fine in development, but now
you cap deploy to your test box with data that matters, and this
doesn't work.

What do people think are best practices for this?

I think you're going to shoot yourself in the foot if you keep doing this,
but that's just me :slight_smile: I kind of view what you are doing as going back in
history with CVS/subversion and changing a file rather than keeping the
history intact and moving forward...

I don't see why having lots of little migrations is a bad thing... also,
take a look at the following rake tasks:

rake db:schema:dump # Create a db/schema.rb file that can be portably used
                     # against any DB supported by AR

rake db:structure:dump # Dump the database structure to a SQL file

And at the annotate models plugin:
http://www.agilewebdevelopment.com/plugins/annotate_models

Which will spit out migration-like syntax as comments at the top of each
model based on the db at that time...

Migrations are supposed to be both a way to make your database
definition platform (db2, oracle, mysql, etc) agnostic, and a good way
to share database changes between team members.

While it works excellently for: creating a versionable database
definition, and not tying you to one database, it has several
shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other
team members just update from source control, will now have a
024_add_something.rb, run rake migrate, and presto, they're database
looks like every other developers.

The problem is, databases may change a lot. Picture a fictional
accounting system with:
001_add_users.rb,
002_add_invoices.rb,
003_add_payments.rb, etc

Now, during the course of development, the users table for instance may
change a lot. You may need to add new fields, maybe we changed the max
length for last names, now we want to track roles/user types, someone
stored home-phone as one field, but we want it broken out into area
code, exchange, etc. This could be a lot of little changes. If you add
a new migration for each one of these, it is MUCH harder than just
looking at one create_table call in one file to see the structure.

Sean,

I pretty much second Philip's opinion on this. You shouldn't use the migrations as a way to view the current state of the database. They are really designed to allow you to move from one known database state to another with minimal effort - especially if your application is deployed to multiple locations. If I'm right, every time you migrate the database, the rake migrate task dumps the current schema (in either Ruby or SQL format).

What we end up doing is just modifying the original 001_add_users.rb. A
simple rake migrate VERSON=0, followed by rake migrate brings you up to
date EXCEPT that you lose all your data. Fine in development, but now
you cap deploy to your test box with data that matters, and this
doesn't work.

What do people think are best practices for this?

I think if you really want to go down the route of "clean" migrations, the best you can do is this:

* during development, do rake migrate VERSON=0 as often as you want while writing the migrations
* deploy your application at version N
* for future development, do rake migrate VERSON=N until your next set of migrations is ready
* deploy version M>N
* etc...

You can combine this with a continuous integration system. I haven't used DamageControl (correct me if that isn't a Ruby CI tool!) but I imagine it would work happily like that.

In the absence of a CI tool, I'd just relax, write as many migrations as you need to, and let the rake voodoo take care of the rest!

Ashley

I have had an issue with this regarding migrations in the past.

Example

01-Make a user migration

05-Modify user with a new migration

10-Add a column to user with validtaions in the model

11-other migrations that I don’t want to loose

Now if I want to roll back to version 5 in the database? My application is dead.

The applications has evolved and cannot roll back, except through the use of svn or similar.

Am I missing something fundamental in the use of migrations, because to
me it seems that the database and application are not completely
independant, where migrations seem to assume that they are.

Daniel ----- wrote:

Example

01-Make a user migration

05-Modify user with a new migration

10-Add a column to user with validtaions in the model

11-other migrations that I don't want to loose

Now if I want to roll back to version 5 in the database? My application
is
dead.

The applications has evolved and cannot roll back, except through the

use of
svn or similar.

Exactly, through svn or similar. What’s the confusion?

Well I guess where I’m not sure is what is the point of being able to
roll back the schema independantly of the app, when the
application can’t cope with it.

In order for a db to be rolled back, you must use the latest version of
you apps (trunk) migration files to go back to a version that is
compatible with svn version xxx.

It’s not like you can checkout svn version xxx and run the migration in
it to roll back the database, since the new migration down methods
aren’t there.

Am I missing something fundamental in the use of migrations, because to

me
it seems that the database and application are not completely
independant,
where migrations seem to assume that they are.

No, they aren’t independent. I don’t see how migrations assume they are.

If I can roll back migrations without rolling back an app then I’d say
migrations assume that they are independent. Just my opinion
though and I’d really like to understand this better.

I use migrations always, I’m just not sure how robust they are, or what is the best way to use them robustly.

Cheers

Until the application is deployed for the first time, I always use a single migration that I constantly edit and reload. I use a db:reload task for this:

namespace :db do
desc ‘Wipe the database, migrate, and load fixtures’

task :reload => :environment do
raise “Will not run db:reload in production” if RAILS_ENV == ‘production’

connection = ActiveRecord::Base.connection
database = connection.instance_variable_get

(’@config’)[:database]

connection.drop_database(database)
connection.create_database(database)
connection.execute("use #{database}")

Rake::Task['db:migrate'].invoke
Rake::Task['db:fixtures:load'].invoke

end
end

-Jonathan.

Well I guess where I'm not sure is what is the point of being able to roll
back the schema independantly of the app, when the application can't cope
with it.

In order for a db to be rolled back, you must use the latest version of you
apps (trunk) migration files to go back to a version that is compatible with
svn version xxx.

It's not like you can checkout svn version xxx and run the migration in it
to roll back the database, since the new migration down methods aren't
there.

Not exactly. If you wish to roll back from revision 9678 of your source
to 9542, you can look in source control and see what the most recent
migration revision was at the time of 9542's checkin. You can then
migrate back down to that version, and THEN revert your codebase to
revision 9542, and you will be in sync.

As far as looking at schema.rb for the overall view of your database, I
stupidly was not aware/thinking of this.

I still think that in _development_, the use of multiple little
migrations is a bad choice. Migration number collisions have happened
multiple times that I've seen, and I do not think it is a people
problem. If I want to create a new table (hence new migration), I
shouldn't have to yell around the office, or circulate an email saying
"Did anyone create a new migration they didn't check in? I'm making one
now, so svn ci so we don't collide".

Just because I make a new migration, I may not want to check it in
right away. What if I define a table, write the model code and a
controller, work with it a little, and decide I want to lay the table
out different? I haven't checked in, I should be able to just modify
the migration. Write more code, test some more, change the table again,
etc. When I feel satisfied enough to check in, only then should I have
to.

The idea of "concurrent" version control is that I shouldn't have to be
totally in sync with my team for everyting. If I check in a conflict,
the tool (source control) will tell me that it can't merge it, and then
I will use my brain to merge the files myself.

On that note, in the case of migration conflicts, if dev A checks in
012_add_videos.rb and dev B checks in 012_add_albums.rb, source control
is fine with it. If the migration is one file, the conflict will
usually be flagged during check in.

So far I'm much more convinced that a single migration in development
makes much more sense, and the versioning functionality of multiple
migrations makes a lot more sense in moving between stable releases. As
for all of the people that talk of "throwing away the history", that's
what source control is for - I can always look at previous revisions in
the history to see previous changes.

I’ll pay that one :wink:

thanx for the response.

Hi !

01-Make a user migration
<snip>
05-Modify user with a new migration
<snip>
10-Add a column to user with validtaions in the model
<snip>
11-other migrations that I don't want to loose

What you need to do is to add models INSIDE your migration:

class MigrateEstimateAddressesToAddress < ActiveRecord::Migration
  def self.up
    Estimate.find(:all, :conditions => ['address_line1 IS NOT NULL AND
LENGTH(address_line1) > 0']).each do |estimate|
      ...
    end

    Address.update_all("addressable_type = 'Estimate'",
"addressable_type LIKE 'MigrateEstimateAddressesToAddress%'")
  end

  def self.down
    Address.find(:all, :conditions => ["addressable_type =
'Estimate'"]).each do |address|
      ...
    end

    Address.delete_all("addressable_type = 'Estimate'")
  end

  class Address < ActiveRecord::Base
    belongs_to :addressable, :polymorphic => true
  end

  class Estimate < ActiveRecord::Base
    has_one :address, :as => :addressable
  end
end

I found this trick from Toolman Tim here:
http://toolmantim.com/article/2006/2/23/migrating_with_models

This is a GREAT way to manage your models. Plus, you can create
models just for your migrational needs. Maybe you don't need the full
validation model, or you don't need the relations. It doesn't matter,
this model is only used during migrations.

The only caveat is as described above. Polymorphic assocations must
be manually corrected. STI doesn't seem to have this problem.

I agree though that plain version numbers can cut it for ordering
migrations when multiple developers are used. I thought of using UTC
timestamps instead, but it still doesn't help, because I need to run
unrun migrations.

What we'd really need is a table that knew which migrations were
applied. A "patch" oriented way of working, if you will. Then, we
need dependencies between migrations... This is quickly becoming hard
to manage.

What I really want is this:

class MigrateEstimateAddressesToAddress < ActiveRecord::Migration
  depends_on :create_addresses, :create_estimates

  def self.up
  end

  def self.down
  end
end

The DB should have a table such as this one:

CREATE TABLE migrations (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255) NOT NULL,
  applied_at DATETIME NOT NULL)

For dependencies, we'd need another table, obviously. You get the drift.

Hope that helps !

Ben Lisbakken wrote:
> I totally agree - I do the same things as you.
>
> Also, it's annoying when I make a new migration (lets say 005) and then
> my partner makes one too (his will also be 005!). If you only submit
> code every few days then it gets tiresome renaming them.

That is a communication problem, not a migrations problem. You need to
coordinate schema updates just like you have to avoid editing the same
part of the same source file.

Yes, we have to do this every time. Its a "shout over the wall" kind
of thing. In practice, we walk around and get a census before starting
any migration.

To answer the original poster, I'd suggest some other way of
enumerating the migration numbers. This could be like annotating the
number with a userid :

007-edh_create_users.rb
007-mm_add_pay_date_to_payments.rb

This would alleviate the issue of duplicate names, at least. I realize
it doesn't fix the problem of incompatible changes to the same table,
but that is really a design team issue.

In my ideal world, I'd make the migration naming thing hookable. Then
each team could devise their own coordinated enumeration scheme. A
shared DB, for instance, to which you insert a new record every time
you generate a new migration/model, the ID becomes the schema version
#. Combine that with one of the other poster's rake task that does an
update, a migration, then a commit all inside a transaction making the
whole thing atomic.

Ed

It makes it very easy to edit migration files that have already been loaded into the development database. It wipes the slate and starts afresh. I’m never worried about the data in my development database as I usually use the test fixtures as development data, it’s standard practice for the projects I work on.

Without such a rake task, if I edit a migration that has already been loaded into the development database it can be a pain to rerun it with the changes. Often you can simply roll back to the previous migration then migrate up again, but that won’t work if you’ve also altered the downwards migration.

At least for me, it is much less work to just type rake db:reload

-Jonathan.