Rob Biedenharn wrote:
[...]
Matt, Pito, Marnen, and anyone else,
1. The opinion on whether db/schema.rb goes into the source repository
has changed over time.
No. I've used Rails since 1.2.6. Every version has put a comment in
the schema.rb file that recommends putting it into version control.
I've been using Rails since 0.13 and I'll restate that this opinion has changed.
I can't find it at the moment, but there WAS a version that specifically said in
that comment to NOT put the file into source control. At least in 1.2.2, there
is no comment either way.
It is an opinion. (I'll give you some of mine.)
Opinion is not an excuse for advocating dangerous practices.
"Dangerous"? Well, that's certainly an opinion.
2. When properly written, migration files are not "prone to having
problems" and certainly do not have to get worse over time. The advice
here is to define at least a minimal version of your AR model class
nested within the migration class if you need to do *any* manipulation
at the model level.
But you should never be running old migrations in the first place. If
you need version 1000 of the schema for a new installation, then don't
start at zero and run 1000 migrations -- just do rake db:schema:load and
have done with it. This is the core team's recommendation, and I think
it's a good one.
I have helped other developers who have created several migrations in development, which were applied approximately when created, that were subsequently unable to run when deployed to production. The very same recommendations that guard against this kind of problem will make it possible to run those 1000 migrations (or any subsequence) without problem.
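The recommendation in point 2 (a minimal model class nested inside the migration class) relies on ordinary Ruby constant lookup. Here is a minimal sketch of that mechanism. ActiveRecord is deliberately left out so the snippet runs on plain Ruby; in a real migration the nested class would inherit from ActiveRecord::Base and the outer class from the migration base class. The names Product, AddFlagToProducts, source, and model_used are hypothetical, chosen only for illustration.

```ruby
# Stand-in for the application's model, which keeps changing
# long after the migration was written.
class Product
  def self.source
    "app/models/product.rb"
  end
end

# Stand-in for a migration class (hypothetical name).
class AddFlagToProducts
  # Minimal model owned by the migration itself. In a real migration
  # this would be `class Product < ActiveRecord::Base; end`.
  class Product
    def self.source
      "nested in migration"
    end
  end

  def self.model_used
    # Lexical scope finds AddFlagToProducts::Product before ::Product,
    # so later edits to the application's model cannot break this code.
    Product.source
  end
end

puts AddFlagToProducts.model_used
```

Because the migration only ever sees its own frozen definition, it should keep running even if the application's Product is later renamed, given new validations, or deleted.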
3. I think that db/schema.rb does *NOT* belong in the repository.
Why not? Without it, you can't load the schema with Rake.
This is particularly true in a multi-developer environment where
migrations are being initially created on different branches.
I don't see how this makes a difference. Your VCS should be able to
merge the schema.rb files. If not, get a better VCS.
Yikes! No! The database itself holds the official version of the schema. If I merge changes from a master branch into my development branch, I will run any new migrations, but I certainly don't want some merge tool to give me a new schema.rb. Depending on the actual content of the migrations on different branches and the order in which they are run, the *actual* schema might be slightly different due to the rules for where new columns are placed on a table.
The
database itself isn't going into the repository after all.
That's a red herring.
If you need
the file, run a rake db:schema:dump or just run migrations.
No! You need the file so that rake db:schema:load is possible. You've
got it completely backwards.
You assume that I need to run db:schema:load, which *I* don't.
If you
think about why the migration numbering (file naming) was changed from
sequence number to timestamp, you'll realize that the practice of such
"interleaved" migrations was a much bigger pain-point than what to do
about db/schema.rb.
I don't really understand what you're getting at here.
When migrations were sequentially numbered, two developers on separate branches might both create migration 005 for different purposes. This was a problem. The chance that two developers both create migration 20100422105524 is acceptably small. Of course, they will probably be executed in a different order and if they both add a column to the same table, those columns will likewise be in a different order. If db/schema.rb is in the repository, then lots of commits will have effectively meaningless changes and unless the current HEAD has *my* version it is *not* going to truly represent what's in my database schema.
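To make the numbering argument concrete, here is a rough sketch in plain Ruby. It assumes the timestamp prefix is the UTC time formatted as YYYYMMDDHHMMSS, which matches the 20100422105524 example above. The helpers next_sequential, timestamp_version, and columns_after, and the column names, are hypothetical and only simulate the behaviour; nothing here touches Rails or a database.

```ruby
# Old scheme: each branch takes "highest existing number + 1".
def next_sequential(existing)
  format("%03d", existing.map(&:to_i).max + 1)
end

# Timestamp scheme: UTC time formatted as YYYYMMDDHHMMSS.
def timestamp_version(time)
  time.utc.strftime("%Y%m%d%H%M%S")
end

# Hypothetical table: each migration appends one column,
# in the order the migrations happened to run.
def columns_after(base, added_in_run_order)
  base + added_in_run_order
end

existing = %w[001 002 003 004]
puts next_sequential(existing) # both branches would pick this same number

puts timestamp_version(Time.utc(2010, 4, 22, 10, 55, 24))
puts timestamp_version(Time.utc(2010, 4, 22, 11, 2, 7))

base   = %w[id name]
mine   = columns_after(base, %w[price sku]) # my migration ran first here
theirs = columns_after(base, %w[sku price]) # theirs ran first over there
puts mine.sort == theirs.sort # same set of columns
puts mine == theirs           # different order, so the dumped schemas differ
```

The last two comparisons are the point about db/schema.rb: both databases hold the same columns, yet a dump from each would produce a textual diff.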
4. Unless you're initializing (inserting) data via migration, running
all the migrations is really not much different than doing a
db:schema:load because all the migrations are operating on empty
tables.
How can you say this with a straight face? There is no reason *at all*
to run lots of migrations rather than doing a simple schema load.
The reason is deploying to an existing production database. You can't do a schema load. The proper way to apply the "new" migrations (which might be kinda old if the last production deploy wasn't so recent) is a db:migrate. As I've stated, you can get into trouble with mismatches between migrations (which don't change after being created and *shouldn't* if created properly) and models (which obviously change over time).
I have projects with 210 and 204 migrations as well as many with fewer. Some of the migrations deal with rather nasty data manipulations to maintain data relationships when the associations are flipped around. It's not something that I would recommend, but the definition of "reasonable" can change dramatically when a client shifts the way he thinks about the data and its evolution.
As an experiment, I set up a new environment for the 204 migration project and ran the migrations from scratch. It takes about 8 minutes. There is about 1 minute of startup time, and several migrations load some data, including one that takes a bit over 3 minutes to put a few tens of thousands of research datapoints into a set of tables. I'm OK with that amount of time for something as significant as creating a new environment.
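For the existing-production-database case, the selection that db:migrate has to make can be sketched like this. It is a simplified model, assuming the database records which migration versions have already been applied and that everything else on disk is run in version order. The pending_migrations helper and the sample version strings are hypothetical.

```ruby
# Versions present in db/migrate versus versions the database has
# already recorded as applied. Pending = the difference, oldest first.
def pending_migrations(on_disk, applied)
  (on_disk - applied).sort
end

on_disk = %w[
  20100301120000
  20100315093000
  20100422105524
  20100422110207
]

# Production last deployed a while ago, and one older migration
# arrived later through a branch merge.
applied = %w[20100301120000 20100422105524]

puts pending_migrations(on_disk, applied)
```

Each pending migration then runs against whatever the models look like at deploy time, which is where the mismatch between old migrations and current models can bite.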
(Besides, if you have to "scale up your app", you probably
aren't adding a new empty database, but creating a master-slave or
sharding for performance.)
Another red herring.
Well, if you can "scale up your app" by starting from an empty schema, go ahead. Perhaps initializing the shards, but then I'd start with the db/schema.rb from production, not something from the repository which almost certainly reflects a development environment however close to production that might be.
Rob, I know you know a lot about the Rails framework, but your advice
here will make dealing with databases far more difficult than it needs
to be.
If I help someone who recalls some of these nuggets of my wisdom and experience at a time when migrations give them trouble, then I will have made a positive difference.
However, it is that same experience that has led me to the conclusion that keeping db/schema.rb in the source repository is wrong. It is derived data and I would no more put it into the repository than I would have someone put their object files compiled from C or their class files compiled from Java in there.
-Rob
Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com
Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org
--
I'm sure we'll meet again! Take care,
-Rob
Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com