The best way to specify foreign keys?

You can read in "Agile Web Development with Rails 2nd ed." that
migrations don't provide a database-independent way to specify foreign
key constraints, so we have to execute native DDL statements, like
this of MySQL:
execute "alter table line_items add constraint fk_line_item_products
foreign key (product_id) references products(id)"

Is this still the only way to specify foreign keys restrictions?

Thanks!

As far as I know, yes. I wrote a small helper function to do it
though (PostgreSQL syntax):

def foreign_key(foreign_table, foreign_column, primary_table,
primary_column = :id)
  execute "
    alter table #{foreign_table.to_s}
      add constraint fk_#{foreign_table.to_s}_#{foreign_column.to_s}
      foreign key (#{foreign_column.to_s}) references #
{primary_table.to_s} (#{primary_column.to_s})
  "
end

In my migrations, I call it with

foreign_key :orders, :customer_id, :customers

Peace,
Phillip

There's foreign_key_migrations:

   http://www.redhillonrails.org/

I use it in all projects whose database supports FKs.

-- fxn

Is there any weblink to the book to download for free " Agile Web Development with Rails 2nd ed. "

No, it is not available for free. But it is worth every penny you pay for it. I use it all the time.

Peace,
Phillip

Thanks Phillip I will take the book :slight_smile:

I also recommend the plugin from http://www.redhillonrails.org/ for
handling foreign keys.

-Katie

The language can't read you mind and assign the foreign key for the
right parent object...that needs to be done through code when you create
or edit an object , right?

The plugin foreign_key_migrations is very transparent. It relies on standard column naming conventions and provides :references for the rest.

I'm so clueless I can't even imagine what potential problem the extra
effort is heading off.

A couple of benefits: FKs help getting your :dependents right, and as a side-effect you normally (I don't know whether this is universal) get an index on the FK column that speeds up some queries.

-- fxn

The point is: if for example you forget to add :dependent => :destroy to a has_* declaration that should have one (which you will at some point, I promise :slight_smile: the application or the corresponding test _dies miserably_ if you try to delete the parent object. That's because the FK raises an error from the very database, a row with a FK is invalid if the corresponding foreign record does not exist.

Thus, you increase the chances of discovering that bug.

-- fxn

A couple of benefits: FKs help getting your :dependents right, and as
a side-effect you normally (I don't know whether this is universal)
get an index on the FK column that speeds up some queries.

-- fxn

What database do you use that creates an index on a foreign key? The databases that I'm familiar with, primarily SQL Server and PostgreSQL, do *not* automatically create an index on a foreign keyed column. They only do that on primary keys because the unique index is used in the enforcing of the of the primary key constraint. A foreign key is merely a *constraint*, meaning that the database is going to enforce a rule against the population of that column, and in this particular case, the data going into it must already exist in the table/column referenced.

Peace,
Phillip

There are varying opinions about what role a database should play in your application. On one end of the spectrum is the idea that a database is nothing more than a place to store data and all constraints and rules should be enforced in the application. At the other end is the idea that the database should contain as much business knowledge as possible, with the primary intent being the protection of the integrity of the data, and with a secondary intent of isolating the logic in one place. In the first case, each client that accesses the data must implement the same rules, and any variation (or gaps) jeopardizes the value of the data. In the second, the rules are already in place, so new clients don't have to worry about it.

Representing these various positions are people who tend to be very passionate about what they do and believe, so you will often find extremely opinionated conversations. I definitely agree with the idea that a database application is only as good as the data it manipulates. And you can't beat a database engine for enforcing database rules. So I'm am a firm believer in indexing, constraints, and stored procedures (where appropriate). I have, however, seen databases get bogged down by developers trying to do things in them that are better suited to application code. But that's part of the debate that I try very diligently to stay out of :).

But don't just take my word (or anyone else's) for it. Get some experience and decide for yourself.

Peace,
Phillip

It happens in MySQL with InnoDB[*]:

"InnoDB requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. The index on the foreign key is created automatically."

But looks like that is perhaps more exceptional than normal.

-- fxn

[*] http://dev.mysql.com/doc/refman/5.0/en/innodb-foreign-key-constraints.html

That's interesting. PostgreSQL requires the *referenced* column to
have a unique constraint or be part of a key, but it makes no
requirement on the foreign key. I'm not a database engineer (dirty
guts of the things), but I don't see how having an index on the
foreign key provides any benefit in enforcing the constraint. The
lookup is going to happen in the referenced table, not the foreign
table.

I am not positive on this, but I don't believe SQL Server requires a
key on the referenced column at all. Given a small table, that
wouldn't make much of a difference, but that'd be huge on something
like an order_detail to orders relationship.

Peace,
Phillip

Phillip Koebbe wrote:

And you can't beat a database engine for enforcing database rules.

In fact you pretty can't avoid it unless you don't rely on constraints being enforced or have huge amounts of dev resources to throw at the problem.

I believe there is no way of avoiding associations being broken (leaving invalid references) when you only declare them at the application level unless you pretty much reimplement the whole transactional and integrity constraints logic in a distributed manner in the ORM. To keep things sane you simply have to enforce them at the database level.
One example is when you delete an object in a process while creating a child for it in another. There is no way which isn't insanely complex that you can guarantee that the child won't be created with an invalid parent_id column value unless you use the database to enforce it. The simplest way of enforcing this is using foreign keys.

If you really really want to, you could use an archive system for the whole content of the database, never truly deleting objects only marking them 'archived'. But you'll have to code your whole application to support this 'archived' status. It can make sense in some situation, in fact I'm starting a project where I'll have to use this technique for some object types to enforce some business rules, trust me it doesn't make things simpler.

Keep things sane: use Rails validations as your first line of defense with user-friendly errors, then for the messy race conditions let the database handle them by throwing exceptions (they can be show stoppers corrupting your whole database, but they are admittedly relatively rare cases and a simple page reload usually is enough).

Lionel

Phillip Koebbe wrote:

That's interesting. PostgreSQL requires the *referenced* column to have a unique constraint or be part of a key, but it makes no requirement on the foreign key. I'm not a database engineer (dirty guts of the things), but I don't see how having an index on the foreign key provides any benefit in enforcing the constraint. The lookup is going to happen in the referenced table, not the foreign table.
  
There's now benefit when you create a reference, but... think 'DELETE' of the referenced row and you will see a speed benefit :slight_smile:

If you INSERT into a child table you need the to check the parent key exists. An index in the referenced column is good then, right. But if you DELETE a row in the parent table a check for that key in the child table is performed, so an index is good there as well. As far as speeding up the constraint check itself I think an index makes sense on both ends (at least conceptually, I am not a database guy).

On the other side performance tips mention that an index is recommended on foreign keys to speed up joins, so I think that's a fine feature. I guess other databases do not automate index creation to be more flexible.

-- fxn

Ah, the DELETE. Didn't think of that. Thanks to you and Lionel for
reminding me.

On the issue of indexing speeding up queries, that's generally true,
but you have to think about which way your queries are going and
where the look ups are going to be. If you have a list of order
types and an order_type_id in the orders table, the index on
order_type_id is only going to speed up queries when you join from
order_types to orders on order_type_id. If you are querying all
orders and displaying their order_type, such as

select orders.number, order_types.name
from orders
inner join order_types
  on order_types.id = orders.order_type_id

the index on order_type_id doesn't do you any good as it's the
primary key column of order_types that is getting searched. But if
you want to do a count of all orders for each order type, that index
on order_type_id will be orders (no pun intended) of magnitude faster
than without it.

I was working on a database recently that over 2600 indexes for 600-
ish tables. I never had the opportunity to do any research to see
which ones were really necessary, but I always wondered how many
weren't.

Peace,
Phillip