Support for deferrable constraints in PostgreSQLAdapter#disable_referential_integrity

I recently submitted a pull-request

(https://github.com/rails/rails/pull/10939) to address a bug in

PostgreSQLAdapter#disable_referential_integrity related to how the

existing code behaves in a transaction when the PostgreSQL user account

lacks superuser privileges on the PostgreSQL instance.

While this permits the code to function without crashing, it doesn’t

actually make #disable_referential_integrity work unless the user has

superuser privileges (which apply to the entire PostgreSQL instance).

There may be perfectly valid reasons for a developer not to have

superuser privs on the PostgreSQL instance on their development machine,

and I would like to explore options for resolving this issue.

The first item to discuss is the purpose of

#disable_referential_integrity. Technically, the method name implies

that the only thing it is supposed to do is to disable referential

integrity. Referential integrity is but one specific case of a whole set

of database level integrity constraints (including primary keys,

uniqueness constraints, etc.). Few databases support SQL 92 assertions

(see

http://stackoverflow.com/questions/6368349/why-dont-dbmss-support-assertion),

so most developers/DBAs that want more complicated database level

integrity constraints implement them via triggers. In addition, triggers

can also be used to implement automatic caching of computer or derived

data within the database (for instance, updating a table that keeps

track of all direct and indirect reporting pairs for employees in an

organization).

I would suggest that what we really mean by

#disable_referential_integrity is "disable all constraints and automatic

updates that affect more than one row in the database so that I can do

bulk database changes without running into issues sequencing the updates

because I need to make a set of changes that will (or should) leave the

database in a consistent state". It just happens to be that referential

integrity is the most common of these constraints, and so the method was

given its current name. I see no need to change the name, but being

clear about what we mean by it is important.

Now on to the PostgreSQL specifics. PostgreSQL supports two approaches

for disabling constraints and triggers. The first is to use ALTER TABLE

table DISABLE TRIGGER ALL. This requires superuser privileges and

disables both user and system created triggers. The system created

triggers are used by PostgreSQL to implement foreign key contraints (as

well as some other types of constraints). The user created triggers

could have been created by the user for any purpose.

The current implementation of #disable_referential_integrity attempts to

fall back to ALTER TABLE table DISABLE TRIGGER USER for scenarios where

the user lacks superuser privileges. This will disable user created

triggers, but leaves alone the system created triggers used by

PostgreSQL to implement foreign key constraints. This means the foreign

key constraints are still being enforced, and while

#disable_referential_integrity doesn’t raise any errors, referential

integrity is not being disabled!

There is another approach that doesn’t require superuser privs, and that

is to use SET CONSTRAINTS ALL DEFERRED, which defers evaluation of

constraints until the end of the current transaction. This approach has

two requirements. First, it must be issued within a transaction - if

there is no current transaction, the statement has no effect. Second,

the constraints must have been created with the DEFERRABLE parameter.

There is one major advantage to this approach, and that is that the

constraints are verified when the transaction ends. Unlike the Oracle

behavior documented in adapter_test.rb, PostgreSQL does not appear to

verify the referential integrity triggers when they are re-enabled via

ALTER TABLE table ENABLE TRIGGER ALL.

Here are the various scenario options that need to be considered when

deciding what behavior PostgreSQLAdapter#disable_referential_integrity

should implement:

  • User has SUPERUSER vs. User does not have SUPERUSER

  • FOREIGN KEY constraints were defined without DEFERRABLE vs.

defined with DEFERRABLE

  • User has declared additional constraints via user triggers vs.

no user triggers

  • The call was made from within a transaction vs. outside a transaction

Part of the problem in crafting the code is that we don’t have any easy

way of knowing which of the 16 possible scenarios the code faces, but we

can do our best. Also, while it might be nice to provide more control to

the user over options such as whether or not user triggers are disabled,

I’d rather not complicate the interface. Finally, while it would be nice

to verify referential integrity at the end of the block, I don’t think

that takes precedence over ensuring reliable code under a wide variety

of scenarios.

My suggestions are as follows:

  • If the user has SUPERUSER, disable all triggers. This ensures we can

disable referential integrity even if the constraints weren’t declared

with DEFERRABLE, so long as the user has SUPERUSER.

  • If the user doesn’t have SUPERUSER, disable all user triggers and then

use SET CONSTRAINTS ALL DEFERRED.

  • If the call was made from within a transaction, there is no need to

create a nested transaction.

  • If the call was not made from within a transaction, create one

before calling SET CONSTRAINTS ALL DEFERRED and yield to the block

from within that transaction.

While this results in three possible code paths depending upon the

scenario, I believe it gets us as close as possible to the intent of

#disable_referential_integrity under all the scenarios.

I would be happy to implement this approach and to develop a reasonably

comprehensive test suite to validate it (starting with the test suite,

of course). Should I proceed? Is there any feedback before I start?

–Toby Ovod-Everett

P.S. I don’t intend to open discussion on whether support for

referential integrity belongs in ActiveRecord. I personally am happy

with the current situation. There are at least two 3rd-party modules

that provide this support (foreigner and schema_plus), and I think it is

useful to encourage innovation in this space. I do, however, strongly

believe that #disable_referential_integrity belongs in ActiveRecord

because it exposes an interface to both internal code (i.e. fixtures)

and 3rd-party code (database_cleaner, yaml_db, etc.) that need this

support, and centralizing that improves the chances of both getting the

behavior implemented correctly and simplifying life for the code that

needs this support to tap into it.