Migrations Shortcomings

Migrations are supposed to be both a way to make your database definition platform (db2, oracle, mysql, etc) agnostic, and a good way to share database changes between team members.

While it works excellently for: creating a versionable database definition, and not tying you to one database, it has several shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other team members just update from source control, will now have a 024_add_something.rb, run rake migrate, and presto, they're database looks like every other developers.

The problem is, databases may change a lot. Picture a fictional accounting system with: 001_add_users.rb, 002_add_invoices.rb, 003_add_payments.rb, etc

Now, during the course of development, the users table for instance may change a lot. You may need to add new fields, maybe we changed the max length for last names, now we want to track roles/user types, someone stored home-phone as one field, but we want it broken out into area code, exchange, etc. This could be a lot of little changes. If you add a new migration for each one of these, it is MUCH harder than just looking at one create_table call in one file to see the structure.

What we end up doing is just modifying the original 001_add_users.rb. A simple rake migrate VERSON=0, followed by rake migrate brings you up to date EXCEPT that you lose all your data. Fine in development, but now you cap deploy to your test box with data that matters, and this doesn't work.

What do people think are best practices for this?

I somewhat feel that migrations, along with code in them to transform data when moving back/forward make a lot more sense once the code has reached a stable point, i.e. coinciding with actual releases.

Perhaps it would be cool if running migrate generated intermediate SQL that you could check out before creating the tables, to avoid having one table def scattered through 8 migration files.

I totally agree - I do the same things as you.

Also, it's annoying when I make a new migration (lets say 005) and then my partner makes one too (his will also be 005!). If you only submit code every few days then it gets tiresome renaming them.

-Ben Lisbakken

(skip down)

Migrations are supposed to be both a way to make your database definition platform (db2, oracle, mysql, etc) agnostic, and a good way to share database changes between team members.

While it works excellently for: creating a versionable database definition, and not tying you to one database, it has several shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other team members just update from source control, will now have a 024_add_something.rb, run rake migrate, and presto, they're database looks like every other developers.

The problem is, databases may change a lot. Picture a fictional accounting system with: 001_add_users.rb, 002_add_invoices.rb, 003_add_payments.rb, etc

Now, during the course of development, the users table for instance may change a lot. You may need to add new fields, maybe we changed the max length for last names, now we want to track roles/user types, someone stored home-phone as one field, but we want it broken out into area code, exchange, etc. This could be a lot of little changes. If you add a new migration for each one of these, it is MUCH harder than just looking at one create_table call in one file to see the structure.

What we end up doing is just modifying the original 001_add_users.rb. A simple rake migrate VERSON=0, followed by rake migrate brings you up to date EXCEPT that you lose all your data. Fine in development, but now you cap deploy to your test box with data that matters, and this doesn't work.

What do people think are best practices for this?

I think you're going to shoot yourself in the foot if you keep doing this, but that's just me :slight_smile: I kind of view what you are doing as going back in history with CVS/subversion and changing a file rather than keeping the history intact and moving forward...

I don't see why having lots of little migrations is a bad thing... also, take a look at the following rake tasks:

rake db:schema:dump # Create a db/schema.rb file that can be portably used                      # against any DB supported by AR

rake db:structure:dump # Dump the database structure to a SQL file

And at the annotate models plugin: http://www.agilewebdevelopment.com/plugins/annotate_models

Which will spit out migration-like syntax as comments at the top of each model based on the db at that time...

Migrations are supposed to be both a way to make your database definition platform (db2, oracle, mysql, etc) agnostic, and a good way to share database changes between team members.

While it works excellently for: creating a versionable database definition, and not tying you to one database, it has several shortcomings in my eyes.

The idea is, when you make a change, it becomes a new migration. Other team members just update from source control, will now have a 024_add_something.rb, run rake migrate, and presto, they're database looks like every other developers.

The problem is, databases may change a lot. Picture a fictional accounting system with: 001_add_users.rb, 002_add_invoices.rb, 003_add_payments.rb, etc

Now, during the course of development, the users table for instance may change a lot. You may need to add new fields, maybe we changed the max length for last names, now we want to track roles/user types, someone stored home-phone as one field, but we want it broken out into area code, exchange, etc. This could be a lot of little changes. If you add a new migration for each one of these, it is MUCH harder than just looking at one create_table call in one file to see the structure.

Sean,

I pretty much second Philip's opinion on this. You shouldn't use the migrations as a way to view the current state of the database. They are really designed to allow you to move from one known database state to another with minimal effort - especially if your application is deployed to multiple locations. If I'm right, every time you migrate the database, the rake migrate task dumps the current schema (in either Ruby or SQL format).

What we end up doing is just modifying the original 001_add_users.rb. A simple rake migrate VERSON=0, followed by rake migrate brings you up to date EXCEPT that you lose all your data. Fine in development, but now you cap deploy to your test box with data that matters, and this doesn't work.

What do people think are best practices for this?

I think if you really want to go down the route of "clean" migrations, the best you can do is this:

* during development, do rake migrate VERSON=0 as often as you want while writing the migrations * deploy your application at version N * for future development, do rake migrate VERSON=N until your next set of migrations is ready * deploy version M>N * etc...

You can combine this with a continuous integration system. I haven't used DamageControl (correct me if that isn't a Ruby CI tool!) but I imagine it would work happily like that.

In the absence of a CI tool, I'd just relax, write as many migrations as you need to, and let the rake voodoo take care of the rest!

Ashley

I have had an issue with this regarding migrations in the past.

Example

01-Make a user migration

05-Modify user with a new migration

10-Add a column to user with validtaions in the model

11-other migrations that I don’t want to loose

Now if I want to roll back to version 5 in the database? My application is dead.

The applications has evolved and cannot roll back, except through the use of svn or similar.

Am I missing something fundamental in the use of migrations, because to me it seems that the database and application are not completely independant, where migrations seem to assume that they are.

Daniel ----- wrote:

Example

01-Make a user migration 05-Modify user with a new migration 10-Add a column to user with validtaions in the model

11-other migrations that I don't want to loose

Now if I want to roll back to version 5 in the database? My application is dead.

The applications has evolved and cannot roll back, except through the

use of svn or similar.

Exactly, through svn or similar. What’s the confusion?

Well I guess where I’m not sure is what is the point of being able to roll back the schema independantly of the app, when the application can’t cope with it.

In order for a db to be rolled back, you must use the latest version of you apps (trunk) migration files to go back to a version that is compatible with svn version xxx.

It’s not like you can checkout svn version xxx and run the migration in it to roll back the database, since the new migration down methods aren’t there.

Am I missing something fundamental in the use of migrations, because to

me it seems that the database and application are not completely independant, where migrations seem to assume that they are.

No, they aren’t independent. I don’t see how migrations assume they are.

If I can roll back migrations without rolling back an app then I’d say migrations assume that they are independent. Just my opinion though and I’d really like to understand this better.

I use migrations always, I’m just not sure how robust they are, or what is the best way to use them robustly.

Cheers

Until the application is deployed for the first time, I always use a single migration that I constantly edit and reload. I use a db:reload task for this:

namespace :db do desc ‘Wipe the database, migrate, and load fixtures’

task :reload => :environment do raise “Will not run db:reload in production” if RAILS_ENV == ‘production’

connection = ActiveRecord::Base.connection
database = connection.instance_variable_get

(‘@config’)[:database]

connection.drop_database(database)
connection.create_database(database)
connection.execute("use #{database}")

Rake::Task['db:migrate'].invoke
Rake::Task['db:fixtures:load'].invoke

end end

-Jonathan.

Well I guess where I'm not sure is what is the point of being able to roll back the schema independantly of the app, when the application can't cope with it.

In order for a db to be rolled back, you must use the latest version of you apps (trunk) migration files to go back to a version that is compatible with svn version xxx.

It's not like you can checkout svn version xxx and run the migration in it to roll back the database, since the new migration down methods aren't there.

Not exactly. If you wish to roll back from revision 9678 of your source to 9542, you can look in source control and see what the most recent migration revision was at the time of 9542's checkin. You can then migrate back down to that version, and THEN revert your codebase to revision 9542, and you will be in sync.

As far as looking at schema.rb for the overall view of your database, I stupidly was not aware/thinking of this.

I still think that in _development_, the use of multiple little migrations is a bad choice. Migration number collisions have happened multiple times that I've seen, and I do not think it is a people problem. If I want to create a new table (hence new migration), I shouldn't have to yell around the office, or circulate an email saying "Did anyone create a new migration they didn't check in? I'm making one now, so svn ci so we don't collide".

Just because I make a new migration, I may not want to check it in right away. What if I define a table, write the model code and a controller, work with it a little, and decide I want to lay the table out different? I haven't checked in, I should be able to just modify the migration. Write more code, test some more, change the table again, etc. When I feel satisfied enough to check in, only then should I have to.

The idea of "concurrent" version control is that I shouldn't have to be totally in sync with my team for everyting. If I check in a conflict, the tool (source control) will tell me that it can't merge it, and then I will use my brain to merge the files myself.

On that note, in the case of migration conflicts, if dev A checks in 012_add_videos.rb and dev B checks in 012_add_albums.rb, source control is fine with it. If the migration is one file, the conflict will usually be flagged during check in.

So far I'm much more convinced that a single migration in development makes much more sense, and the versioning functionality of multiple migrations makes a lot more sense in moving between stable releases. As for all of the people that talk of "throwing away the history", that's what source control is for - I can always look at previous revisions in the history to see previous changes.

I’ll pay that one :wink:

thanx for the response.

Hi !

01-Make a user migration <snip> 05-Modify user with a new migration <snip> 10-Add a column to user with validtaions in the model <snip> 11-other migrations that I don't want to loose

What you need to do is to add models INSIDE your migration:

class MigrateEstimateAddressesToAddress < ActiveRecord::Migration   def self.up     Estimate.find(:all, :conditions => ['address_line1 IS NOT NULL AND LENGTH(address_line1) > 0']).each do |estimate|       ...     end

    Address.update_all("addressable_type = 'Estimate'", "addressable_type LIKE 'MigrateEstimateAddressesToAddress%'")   end

  def self.down     Address.find(:all, :conditions => ["addressable_type = 'Estimate'"]).each do |address|       ...     end

    Address.delete_all("addressable_type = 'Estimate'")   end

  class Address < ActiveRecord::Base     belongs_to :addressable, :polymorphic => true   end

  class Estimate < ActiveRecord::Base     has_one :address, :as => :addressable   end end

I found this trick from Toolman Tim here: http://toolmantim.com/article/2006/2/23/migrating_with_models

This is a GREAT way to manage your models. Plus, you can create models just for your migrational needs. Maybe you don't need the full validation model, or you don't need the relations. It doesn't matter, this model is only used during migrations.

The only caveat is as described above. Polymorphic assocations must be manually corrected. STI doesn't seem to have this problem.

I agree though that plain version numbers can cut it for ordering migrations when multiple developers are used. I thought of using UTC timestamps instead, but it still doesn't help, because I need to run unrun migrations.

What we'd really need is a table that knew which migrations were applied. A "patch" oriented way of working, if you will. Then, we need dependencies between migrations... This is quickly becoming hard to manage.

What I really want is this:

class MigrateEstimateAddressesToAddress < ActiveRecord::Migration   depends_on :create_addresses, :create_estimates

  def self.up   end

  def self.down   end end

The DB should have a table such as this one:

CREATE TABLE migrations (   id INT PRIMARY KEY AUTO_INCREMENT,   name VARCHAR(255) NOT NULL,   applied_at DATETIME NOT NULL)

For dependencies, we'd need another table, obviously. You get the drift.

Hope that helps !

Ben Lisbakken wrote: > I totally agree - I do the same things as you. > > Also, it's annoying when I make a new migration (lets say 005) and then > my partner makes one too (his will also be 005!). If you only submit > code every few days then it gets tiresome renaming them.

That is a communication problem, not a migrations problem. You need to coordinate schema updates just like you have to avoid editing the same part of the same source file.

Yes, we have to do this every time. Its a "shout over the wall" kind of thing. In practice, we walk around and get a census before starting any migration.

To answer the original poster, I'd suggest some other way of enumerating the migration numbers. This could be like annotating the number with a userid :

007-edh_create_users.rb 007-mm_add_pay_date_to_payments.rb

This would alleviate the issue of duplicate names, at least. I realize it doesn't fix the problem of incompatible changes to the same table, but that is really a design team issue.

In my ideal world, I'd make the migration naming thing hookable. Then each team could devise their own coordinated enumeration scheme. A shared DB, for instance, to which you insert a new record every time you generate a new migration/model, the ID becomes the schema version #. Combine that with one of the other poster's rake task that does an update, a migration, then a commit all inside a transaction making the whole thing atomic.

Ed

It makes it very easy to edit migration files that have already been loaded into the development database. It wipes the slate and starts afresh. I’m never worried about the data in my development database as I usually use the test fixtures as development data, it’s standard practice for the projects I work on.

Without such a rake task, if I edit a migration that has already been loaded into the development database it can be a pain to rerun it with the changes. Often you can simply roll back to the previous migration then migrate up again, but that won’t work if you’ve also altered the downwards migration.

At least for me, it is much less work to just type rake db:reload

-Jonathan.