rescuing ActiveRecord::RecordNotUnique: clever or ugly?

I have an ActiveRecord that has several foreign keys, where the foreign keys act as a single compound key that needs to be unique:

class CreateRelations < ActiveRecord::Migration   def change     create_table :relations do |t|       t.references :src, :null => false       t.references :dst, :null => false     end     add_index :relations, [:src_id, :dst_id], :unique => true   end end

One obvious way to enforce uniqueness would be using first_or_create:

def self.create_relation(src, dst)   where(:src_id => src.id, :dst_id => dst.id).first_or_create! end

Which works, but that always does a SELECT before creating the record. Since I don't expect users to attempt to create duplicates very often, I've written a the method to catch RecordNotUnique errors, like this:

def self.create_relation(src, dst)   create!(:src => src, :dst => dst) rescue ActiveRecord::RecordNotUnique   where(:src_id => src.id, :dst_id => dst.id) end

So mine is a question of style: is this considered an abuse of rescue? Or is how seasoned Rails programmer would do this?

- ff

I have an ActiveRecord that has several foreign keys, where the foreign keys act as a single compound key that needs to be unique:

class CreateRelations < ActiveRecord::Migration def change create_table :relations do |t| t.references :src, :null => false t.references :dst, :null => false end add_index :relations, [:src_id, :dst_id], :unique => true end end

One obvious way to enforce uniqueness would be using first_or_create:

def self.create_relation(src, dst) where(:src_id => src.id, :dst_id => dst.id).first_or_create! end

Which works, but that always does a SELECT before creating the record.

A custom validation on the compound unique key would behave similar, always do a SELECT (and suffer from the race condition documented in http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of ).

Since I don't expect users to attempt to create duplicates very often, I've written a the method to catch RecordNotUnique errors, like this:

def self.create_relation(src, dst) create!(:src => src, :dst => dst) rescue ActiveRecord::RecordNotUnique where(:src_id => src.id, :dst_id => dst.id) end

So mine is a question of style: is this considered an abuse of rescue? Or is how seasoned Rails programmer would do this?

If you want to be _sure_ of the uniqueness, you need to do this anyway (catch the exception thrown by the database uniqueness validation to resolve the race condition).

Then the use of a validation of uniqueness is merely an optimization to catch the non-uniqueness problem earlier.

And indeed, if the chance for double inserts is low, it may be more efficient to not do the SELECT that is triggered by a uniqueness validation, do an "optimistic" save and fall back it the database tells you it fails.

Just my 2p.

Peter

Peter Vandenabeele wrote in post #1052874:

A custom validation on the compound unique key would behave similar, always do a SELECT (and suffer from the race condition documented in

http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of

).

I was aware of the race condition in the original post, which is why I thought this might be a 'clever' implementation. To merge what @Peter said, I guess you can summarize it as:

If the likelihood of duplicate entries is low, let the db raise an error -- this saves a SELECT call for non-duplicate entries:

    def self.create_relation(src, dst)       create!(:src_id => src.id, :dst_id => dst.id)     rescue ActiveRecord::RecordNotUnique       where(:src_id => src.id, :dst_id => dst.id)     end

If the likelihood of duplicate entries is high, use first_or_create as a first line of defense. This generates an extra SELECT, but (presumably) that's cheaper than frequent calls to raise/rescue[*]. Regardless, keep the rescue clause to avoid race conditions:

    def self.create_relation(src, dst)       where(:src_id => src.id, :dst_id => dst.id).first_or_create!     rescue ActiveRecord::RecordNotUnique       where(:src_id => src.id, :dst_id => dst.id)     end

[*] This begs for a benchmarking test to compare the cost of a SELECT vs raise/rescue. I'll put it on the list.

Fearless Fool wrote in post #1053062:

Peter Vandenabeele wrote in post #1052874:

A custom validation on the compound unique key would behave similar, always do a SELECT (and suffer from the race condition documented in

http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of

).

I was aware of the race condition in the original post, which is why I thought this might be a 'clever' implementation. To merge what @Peter said, I guess you can summarize it as:

If the likelihood of duplicate entries is low, let the db raise an error -- this saves a SELECT call for non-duplicate entries:

    def self.create_relation(src, dst)       create!(:src_id => src.id, :dst_id => dst.id)     rescue ActiveRecord::RecordNotUnique       where(:src_id => src.id, :dst_id => dst.id)     end

If the likelihood of duplicate entries is high, use first_or_create as a first line of defense. This generates an extra SELECT, but (presumably) that's cheaper than frequent calls to raise/rescue[*].

A SQL query takes around 1 ms, possibly more if the db server is on a different machine. That is way more expensive than rescuing an exception.

Joe

Joe V. wrote in post #1053641:

A SQL query takes around 1 ms, possibly more if the db server is on a different machine. That is way more expensive than rescuing an exception.

Joe: Thanks for the useful data point. Just to make sure my understanding is complete, since any db transaction is going to take 1ms (more or less), you would ALWAYS let the db catch the duplicate and avoid the extra SELECT:

    def self.create_relation(src, dst)       create!(:src_id => src.id, :dst_id => dst.id)     rescue ActiveRecord::RecordNotUnique       where(:src_id => src.id, :dst_id => dst.id)     end

Yes?

Let me redact that previous post...my only excuse is that I hadn't had my coffee yet!

If the time to make a DB query (let's call it 1 Unit of time) swamps out any amount of processing in Ruby, then:

Approach A (do a SELECT to check for existence of the record before attempting an insert) will take 1 Unit if the record exists and 2 Units if the record does not exist.

Approach B (always attempt the insert, let the DB signal an error, recover by doing a SELECT) will take 2 Units if the record exists and 1 Unit if the record does not exist.

So I'd conclude that:

If the chance of a duplicate insert is less than 50%, you're better off with Approach B, otherwise you should choose Approach A.

Yes?

I don't know for certain but I imagine a db write takes much longer than a db read, maybe 10 units for the write and one for the read. If so then that changes the calculation somewhat. However there is also the question of whether db access time is significant in the app anyway. Bottlenecks in apps are usually not where you expect them to be. I would do it the way that seems cleanest to you and only worry about optimisation if throughput becomes an issue, in which case you will have to do some profiling to find where the bottlenecks are.

Colin