I'm a bit concerned about that too, so I've been sticking every piece of
validation I can into the database layer as opposed to in ruby code.
To be fair, having referential integrity in the ruby side shouldn't slow
things down, although it does make your database more fragile; references
arent loaded unless you ask them to be. A regular index on a join column
would be just as fast as a foreign key index, possibly faster on
updates/deletes.
However, you lose a lot of the *integrity* part of that if you leave this
up to rails. The main reasons referential integrity and constraints exist
are to to defend your data layer from bugs in the application layer. Stuff
like preventing a customer that has an order pending from being deleted, or
making sure an invoice can not be saved with a total of less than one cent.
Sure, you can code all of this into the ruby side, but then if you're
working on something by hand (in script/console or doing raw SQL queries)
you can still mess things up pretty badly.
The way rails handles transactions does bug me a bit though. If I need to
do a complex sequence of changes in a database as one transaction, I will
probably do it as a stored procedure on the database side and just call out
to that. I'm already doing this for "side effects" in one of my
applications.... in several places, but here's one example: I have a table
for hostnames, and a table for domain names. Hosts belong to Domains. Hosts
have a "full_name" column that combines their name with the name of their
parent (eg; host "foo" in domain "bar.com" gets the full_name
"foo.bar.com"). Instead of coding this as a before_save action on the rails
side, I made it a "BEFORE INSERT OR UPDATE" trigger on the postgresql side.
I think leaving as much of the data massaging/validating up to the
database as possible is not just a bit faster, but also probably a lot
safer.
Rails needs to provide these features because not all database engines
support them (eg; SQLite)... but if something to do with the data model can
be done in the database, I think it should. That *doesn't* mean that you
should do something in the database layer JUST because you can. The database
should only been concerned with handling the data. I've seen stored
procedures that actually do things like send email, and that, well, scares
me.
Cheers,
Tyler
Err, that's what the directory marked 'test' is for.
All you are doing here is stating that you don't trust your testing
regime and loading your database with unnecessary checks.
The main reasons referential integrity and constraints exist
It doesn't matter who I trust. The *database* should not trust the
application. The database is the custodian of the data, and if it lets
the data get corrupted by outside hands, that's it's fault and something
that it could have prevented.
The "application" isn't always the rails app, either. It might be you
going in and doing queries manually. And there are numerous bugs that would
not get caught by conventional unit testing, that would get caught by a
strong database layer - for instance, a dispatcher process getting 'kill
-9'ed in the middle of a complex set of data manipulation (which really
should have been a stored procedure on the SQL side in the first place...)
Cheers,
Tyler
We can go back and forth on this until the end of time. There’s obviously no “God-ordained right way” to deal with databases and referential integrity (see the eBay link above) no matter how hard people push each side. If you have multiple applications hitting the same database, then yes, you’ll pretty much need DB integrity built into your tables, otherwise it’s really a personal choice taking into account usages, scaling and what not.
Personally, I HATE stored procedures as they, depending on the DBMS, are quite difficult to write tests for and add another external dependency on the application.
I wonder if the RevolutionHealth has anything to say on this subject. I’m gonna search their blog, see what’s up.
Jason
You're right. The original concern was that the lack of RI/transactions on
the DB side would cause scaling problems. I guess the point I was trying to
illustrate was, that when you are developing in rails, you can still take
advantage of pretty much everything your RDBMS has to offer.
Cheers,
Tyler
Tyler MacDonald wrote the following on 30.05.2007 18:44 :
Err, that's what the directory marked 'test' is for.
All you are doing here is stating that you don't trust your testing
regime and loading your database with unnecessary checks.
It doesn't matter who I trust. The *database* should not trust the
application.
Then code your application in the database. See you next century
The database is the custodian of the data, and if it lets
the data get corrupted by outside hands, that's it's fault and something
that it could have prevented.
I don't think you can reasonably describe all constraints on your data
in the database. This probably could be done with trigers and stored
procedure but would be a maintenance hell. Next thing you would push the
access control to the database layer too (some have tried and this has
been a nightmare for them : RDBMs supports for groups and acls is not
standardised and make migrations across them impossible, everytime the
user/group attributes aren't enough and you must add other tables to
store basic informations which complicates your object model, ...).
At this time the best thing in my opinion is to combine the
transactional capabilities and foreign keys support of RDBMS with
validation checks out of the database (which makes it far easier to
report errors to users BTW, a RDBMs raw error string isn't exactly
useful to the end-users).
Putting your validation code out of the DB means that migrating your DB
to new constraints is far easier.
I've done several migrations where constraints that were true in the
previous application versions weren't anymore which meant some data
became invalid, it's far easier to handle this problem with external
code : you can run all your validating code on current data while it's
still in use just to measure the extent of the problem and evaluate the
time needed to reformat some columns for example. With constraints in DB
you have far less friendly tools : the DB can refuse changing
constraints due to invalid data or accept them without a posteriori
problem detection (when the check is only done at INSERT or UPDATE time
for example) -> your data becomes broken. Solving this is possible but
far more difficult than it is at the application level.
The "application" isn't always the rails app, either. It might be you
going in and doing queries manually.
I agree you're far better off with only one consistant mean of accessing
your data. But that doesn't mean it must be the DB, with Rails you are
usually advised to provide a REST interface in order to make other
applications use the same validation code. The script/console tool
replaces the db command-line tool with the same benefit.
And there are numerous bugs that would
not get caught by conventional unit testing, that would get caught by a
strong database layer - for instance, a dispatcher process getting 'kill
-9'ed in the middle of a complex set of data manipulation (which really
should have been a stored procedure on the SQL side in the first place...)
Rails supports transactions, a "kill -9" doesn't harm your data unless
you are working outside transactions, no need for stored procedures,
point blank.
Lionel