I cannot help but believe that matters are headed in exactly the wrong
direction. We already have a serious problem (as discussed in AWDR)
with the asynchronous relationship between code and the current
migration level. Now there is talk about creating ANOTHER
asynchronous set of tasks which depends on both code and migration?
The creation of an administrative user is very tightly bound to the
creation of the users table. Separating them looks unnatural,
difficult, and error-prone to me. I expect that there are similar
situations in other circumstances.
I've tried to adopt DHH's approach of using migrations just as a
change agent and relying on the schema dumper to reveal the state.
What I've found lacking in this approach is that the schema dumper
shows the current schema state AS IS, not AS INTENDED. I'm not
perfect in my practices, and perhaps there is a best practice where
the AS IS (in development) always matches the AS INTENDED (what I want
to test and eventually put into production). But until I find such a
practice, schema dumper doesn't do it for me, and that's before we
even get into the seed data issue.
Migrations definitely serve the purpose for transitory changes. But
they also do a superb job of defining a self-documenting AS-INTENDED
schema using a well-understood and flexible syntax that even supports
loading of "seed" data. Is it possible that migrations, while only
intended to do the job of transitory schema changes, have also turned
out to be the best available tool for building a database from
scratch?
-------- Extra Notes on my practices, feel free to critique ---------
1. I build my initial database (including some minimal seed data) by
creating a 001_baseline.rb migration.
2. Over time, I adjust the schema (and data) with migrations
0nn_<change>.rb
3. When the number and complexity of the migrations begin to get
unwieldy, I condense my migrations into a new baseline. For this
step, I rely heavily on the output of the schema dumper (:sql). I
think this step is essential to good documentation of the AS INTENDED
schema because the net result of a long sequence of meandering
migrations can be difficult to grasp.
In an ideal world, I would have (in priority order):
(a) support for the testing of migrations; today you need either a
plugin or many steps to test with migrations.
(b) more explicit support for using fixtures in migrations (AWDR shows
how, but it could be cleaner).
(c) support for condensing my migrations into a new baseline.
Something like turning over a new DB epoch, with an epoch marker.
> I don't think I understand this. Why do you want or need to continuously test the migrations?
Let me try to explain.
* there is no up-to-date development environment on the continuous
integration box
* but I do want to rebuild the database from scratch in every CI
build. From what?
* if I use db/schema.rb, I am relying on an artefact that was
automatically generated in somebody's development environment. Which
is not how it will be done in production. I also cannot expect
everybody to always pay attention when checking-in auto-generated
artefacts.
* running all migrations then looks like a better choice.
> Once everyone has been moved, the migrations are useless and could essentially be deleted.
Yeah, having many migrations floating around is awkward, too. One can
take schema dumper output and make a new baseline migration out of it,
with the same number as the DB_VERSION in the last prod release.
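To illustrate why the new baseline should carry the same number as the last production DB_VERSION, here is a minimal sketch in plain Ruby. It is only a simplified model of numbered migrations (a migration runs when its number is greater than the version recorded in the database), not the Rails migrator itself; `pending_migrations` and the file names are hypothetical.

```ruby
# Simplified model, NOT the Rails migrator: a numbered migration runs only
# when its number is greater than the version recorded in the database.
# `pending_migrations` and the file names below are hypothetical.
def pending_migrations(filenames, current_version)
  filenames
    .select { |name| name =~ /\A\d+_.+\.rb\z/ }    # keep NNN_description.rb files
    .sort_by { |name| name.to_i }                   # "012_baseline.rb".to_i is 12
    .select { |name| name.to_i > current_version }  # skip what was already applied
end

# Migrations 001..012 have been condensed into a single 012_baseline.rb.
after_collapse = %w[012_baseline.rb 013_add_roles.rb]

# A fresh CI database starts at version 0 and runs the baseline first.
fresh_ci_database = pending_migrations(after_collapse, 0)

# Production is already at version 12, so the baseline is skipped there.
production_database = pending_migrations(after_collapse, 12)
```

Under this model a from-scratch build gets the whole schema from the baseline, while an existing production database only sees the migrations added after the collapse.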
>> I don't think I understand this. Why do you want or need to continuously test the migrations?
> Let me try to explain.
> * there is no up-to-date development environment on the continuous
> integration box
> * but I do want to rebuild the database from scratch in every CI
> build. From what?
Yes, dropping the db and migrating everything is the sensible approach in CI.
> * if I use db/schema.rb, I am relying on an artefact that was
> automatically generated in somebody's development environment. Which
> is not how it will be done in production. I also cannot expect
> everybody to always pay attention when checking-in auto-generated
> artefacts.
Right. In general, I think it's bad practice to check in generated
artifacts. It's more work for everyone to remember to check in when
they change the schema. It's error-prone and people often forget to
check in, which means the CI build breaks and you waste time figuring
out what broke, who forgot to check in, and blaming them. Better to
just always run the migrations in CI and svn:ignore schema.rb.
> * running all migrations then looks like a better choice.
>> Once everyone has been moved, the migrations are useless and could essentially be deleted.
> Yeah, having many migrations floating around is awkward, too. One can
> take schema dumper output and make a new baseline migration out of it,
> with the same number as the DB_VERSION in the last prod release.
Yep. Speeds up CI and clean DB setups too if there are fewer
migrations. If there were something to automate the collapse of
migrations into a schema dump up to a given version, that would be
great. It's easy to do manually, but it would be a nifty rake task
for someone to publish.
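The planning half of such a task could look roughly like this. It is a sketch in plain Ruby, assuming migrations are named NNN_description.rb as elsewhere in this thread; `numbered_migrations` and `collapse_plan` are hypothetical helpers, not an existing rake task. The body of the baseline file would still have to come from the schema dumper.

```ruby
# Hypothetical helpers, not part of Rails or rake: work out which numbered
# migration files a new baseline at a given version would replace.
MIGRATION_NAME = /\A(\d+)_(.+)\.rb\z/

# Returns [version, filename] pairs sorted by version, ignoring other files.
def numbered_migrations(filenames)
  filenames.map { |name|
    base = File.basename(name)
    match = MIGRATION_NAME.match(base)
    match && [match[1].to_i, base]
  }.compact.sort
end

# Splits the migrations into those folded into the baseline
# (version <= cutoff) and those that stay as they are.
def collapse_plan(filenames, cutoff)
  folded, kept = numbered_migrations(filenames).partition { |version, _| version <= cutoff }
  {
    baseline: format("%03d_baseline.rb", cutoff), # same number as the cutoff version
    folded:   folded.map { |_, name| name },
    kept:     kept.map { |_, name| name }
  }
end

plan = collapse_plan(
  %w[001_baseline.rb 002_add_users.rb 010_add_roles.rb 011_add_index.rb README],
  10
)
```

A rake task built on this would delete the `folded` files and write the dumped schema into the `baseline` file, leaving the `kept` migrations untouched.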