Rails 4.2 Foreign Key Naming

I’m really glad to see foreign key support has made its way into Rails migrations, but very quickly ran into an issue with it when working in teams using structure.sql.

Previously, we’d been using the foreigner gem, which generated FK names based on the table and column. This meant the name was consistent every time you ran the migration. With the version that made it into 4.2, part of the FK name is random. This means every dev who runs the migration locally generates a different version of structure.sql, leading to conflicts in version control.

Right now we’re working round this by explicitly setting the FK name in our migrations. I’d like to find a way of fixing this in Rails itself though.

My understanding was the reason for switching away from the old format was that the name is arbitrary, and can fall out of date on column renames. By itself I’m not totally convinced that’s enough reason to get rid of the more descriptive names (though I agree it’s debatable). The restriction on the length of the name (63 characters in Postgres, for example) is a more compelling reason for me.

Either way, I’d like to contribute a patch to make migrations generate consistent FK names. I’m happy to implement either of the following (or something else if anyone’s got a better ideas):

  • Switch back to generating human-readable names, accept that people with long table/column names will occasionally have to set them manually

  • Generate that human-readable name, hash it, and append a substring of the hash to “fk_rails_”. This would give you a name which is both short and consistent.

Would really appreciate feedback on those approaches.

Cheers,

Chris

Though I have not felt the pain, I have been a long time user of structure files in legacy environments and think consistent names are a good idea. On the flip side, the same could be said for constraint name “churn” too. I have seen this in big Oracle teams and unavoidable to my knowledge and/or only fixed with a custom script after the rake migration task.

All that said, I think this is something Rails could solve and the PG limit does seem like a big constraint too.

  • Cheers,

Ken

My vote would be to be consistent with the way index names work; the long human readable version.

-Amiel

As far as the Postgres limit, that also applies to index names. So I don’t see why we need to handle it any different for foreign keys (i.e., with indexes it just means that it fails on creation if the auto-generated one is too long and the dev has to provide a manual name.)

My own preference is towards the longer, descriptive names. Partly for consistency with index naming, and partly because it’s nice to be able to glance at them and know what they are (though I understand the argument for them falling out of date).

I’ll go ahead and make a PR for that option. If there’s serious objections to it, hashing is easy enough to add. Thanks for sharing your thoughts on this!

Cheers,

Chris

Hey Chris

Sorry for my late response.

tl;dr: I’d rather not go for consistency with index naming but for a consistent hash approach.

We have descriptive names for indexes because they are shown to the user in query plans.

This is not the case with foreign keys. When inspecting a database you can look at the

constraint metadata. I don’t see much value in encoding that metadata into the name.

In my opinion the drawbacks in the implementation are more sever. We faced a lot of issues

when automatically renaming indexes. There are still cases where they fall out of sync

with the actual metadata. Having descriptive names, which no longer represent the metadata

is way worse in my opinion.

Also constant renaming when renaming tables and columns is not very practical. The added

issue of the limited size makes this even worse.

My gut feeling is in favor of constant non-descriptive names. I do remember that I implemented

that exact behavior but we pivoted to purely random names. I’d like to go through our archived

discussions regarding that change to make sure we have all the needed information.

I haven’t had the time to do so.

Cheers,

– Yves

Hi Yves,

No worries. Glad to get another view.

Given that the information is close by anyway (in the constraint as you point out), perhaps it’s not worth the downside of the name falling out of date.

On the flipside it may be worth thinking about variation between databases. I’ve not used MySQL for a long time, but my understanding of foreign keys there (on InnoDB tables) is that they implicitly create an index if one doesn’t exist. Wouldn’t those also show up in query plans (with whatever name the foreign key has)?

I do agree that things being wrongly named is a serious issue. It can cause a whole bunch of wasted time/energy when debugging, and for that reason alone it may be best to avoid the descriptive names.

I’m happy to go either way on the implementation. It’d be great to see some of the original discussions first if you have links to them.

Cheers,

Chris

Future versions of PostgreSQL will use FKs within query plans, so
constraint names will need to be seen by user.

We’re looking at probably Sept 2015 for the first version that
supports this, which isn’t that far away.

They may not be shown in query plans, but they are shown in constraint error messages, and the possibility of hitting those errors is the whole reason one adds foreign key constraints. It’s helpful to have an indication which kind of constraint was invalidated without having to look through the schema.

We do already tolerate the index names being irrelevant if they become inconsistent, but in my personal experience that doesn’t happen very often at all, since I’ve found indexed columns tend to be important and keep their names.

So I would vote for meaningful names too. Just my 2c.

Yes, they are shown in constraint error messages, but so are the metadata (table name and referenced column name).
Following are example messages:

MySQL:

It’s a delicate decision, because if you do automatic human readable generation like in indexes then the expectation that renames should propagate is strong.

Maybe it would be also nice for the purposes of the discussion here to explain the actual gotchas we’ve faced doing that for indexes, so the trade-off is understood.

True - good point. Using random values is a bit weird though, since two people running the same migration will get different names, and then their schemas will never quite match. This means dev machines’ schemas will differ from staging/test environments and them from production.

Wouldn’t a simple CRC hash of the input data equally solve the issue of names potentially getting out of date, without introducing non-determinancy?

More way-too-late bikeshedding: IMHO the method in 4.2 should be renamed to add_foreign_key_with_constraint. A foreign key is just the key for an association. A foreign key constraint is a different concept to a foreign key.

IMHO it’s really important that users understand the difference, and only add constraints when they want them.

The thing is, if you live your life on recent versions of PostgreSQL, they work well enough to be usable. But on MySQL particularly - and in PostgreSQL before they implemented the special weakened lock types - they cause a lot of unintended side effects because they result in a standard shared lock being taken on the parent row.

On one of my big apps we put FKCs in in ~2008, and we’re now actually going through dropping FKCs when we get the chance, because they just cause too many problems at runtime.

Basically, if table A is a parent to table B, and you have an FKC linking the two, then an update to a B row will take some kind of shared lock on A. But it’s also common to have a form for A records which nests B records, and then in practice the #update action is going to save A, and then save the Bs… and then you get deadlocks.

I’ve been finding this sort of problem (there’s others) comes up so often that I’ve gone right off the idea of foreign key constraints, and now tell ppl to only add them when they specifically need them. (And I find usually it’s relatively rare that you do in practice because a lot of parent tables don’t get deletes, and those that do you tend to want to seralize things using update locks anyway.)

TL;DR I would like people to learn from our mistakes and not to accidentally get foreign key constraints when they think they are just adding a foreign key :).

I think changing the semantics of add_foreign_key is best saved for another discussion. The naming change seems to be contentious enough by itself. :wink:

A quick summary of the tradeoffs for human readable names:

  • Consistency with index naming

  • When they show up in query plans, they’ll be understandable

  • They can fall out of date

  • They can be too long and require manual naming

To my mind, the code for renaming them shouldn’t be a huge leap from the same code that’s there for indexes. The most obvious issue is support for constraint renaming in different databases. Postgres only has this in 9.2+, so we’d have to do version checks. By contrast index renaming support goes back to at least 8.0. I have no idea about MySQL.

That said, the foreigner gem (which this built-in functionality replaces) has been generating human readable FK names without renaming them for ages. I’d be really interested in knowing whether this was a major issue for people using it (other than me). From what I can see, this got brought up once (https://github.com/matthuhiggins/foreigner/issues/141) and wasn’t discussed.

I’m getting the feeling this is an issue which just needs someone who’s in charge of activerecord to make a call. I’d like to have the readable names back myself, but it’s not my call and opinions are pretty divided on here.

I’m not convinced that we need human readable foreign key names right now.
Since 4.2. is out with this issue, for the time being I’d like to focus on a backportable solution.

A pull request with deterministic names but still non-human readable is very welcome.

This should be a fairly controlled change that we can bring back to 4.2.x to solve the issue.

We’re open to consider human readable names if they become more relevant in the future (what Simon Riggs suggests).

If you happen to know more about these plans, please keep us posted so we can adjust accordingly.

@Chris Sinjakli: Thank you for your patience. Please let me know if you are up to work on a PR.

Otherwise it would be good to file an issue on GitHub so that someone else can take over.

Cheers,

– Yves Senn

Sounds good. Glad to have a decision on this so we can get away from random names.

I’m happy to do the work and get a PR in. Should hopefully have some time in the next week.

Cheers,

Chris

For reference, that PR is now open.