Summary
I would like to update the active record create/create! methods to take an option to perform the insert query with ON CONFLICT …. Currently performing inserts with ON CONFLICT is possible with insert / insert_all though it would be beneficial for codebases to be able to use this database feature while leveraging the existing model lifecycle features like callbacks and validations.
Additionally the create_or_find_by method can be updated to use ON CONFLICT as well which will remove the need to create a new subtransaction. This is particuarly helpful for Postgres which suffers from performance degradation when many savepoints (> 64) are created within the same transaction. As it dramatically increases disk I/O. See: Shane Borden’s Technology Blog - Do PostgreSQL Sub-Transactions Hurt Performance?
Background
I work on a large rails application created in 2017 which uses Postgres as a database. We’ve recently been working on database performance improvements and we’ve noticed that we’re creating a large amount of savepoints at times due to using find_or_create_by. This is a fairly common pattern as we need to handle race conditions fairly frequently and we may be creating many records in a loop within the same transaction.
While we can sometimes update code to use direct to DB methods that skip creating an active record model (like insert or insert_all) it would take a long time to update all of our code and care would need to be provided to not create bugs as a result of callbacks and validations not running.
I was investigating implementing a drop in replacement to create_or_find_by! within our codebase that executes the model lifecycle, I have a working example which inserts the record via insert and so it uses ON CONFLICT DO NOTHING. It also executes validations, callbacks and uses the models methods and associations:
def create_or_find_by!(attributes, &block)
record = new(attributes, &block)
# Perform active record validations, raise on failure!
record.validate!
# Autosave any non persisted models that are passed in before we continue!
autosave_association_method_calls =
(
(reflections.keys & attributes.keys.map(&:to_s)).map do |key|
method_name = :"autosave_associated_records_for_#{key}"
method_name if record.respond_to?(method_name)
end
).compact
inserted_record = nil
# If we have related records to save then create a transaction, otherwise skip it!
with_optional_transaction(autosave_association_method_calls.any?) do
autosave_association_method_calls.each { |autosave_association| record.public_send(autosave_association) }
# return any database generated values and timestamps!
returning_attributes =
(
columns.select do |column|
column.default_function.present? || %w[updated_at created_at].include?(column.name)
end
).map(&:name)
# wrap inserting with callback logic, throw :abort on error!
inserted_record =
catch(:abort) do
record.run_callbacks :save do
record.run_callbacks :create do
attributes_to_insert = record.attributes
current_time = Time.zone.now
if columns_hash.has_key?('created_at') && attributes_to_insert['created_at'].nil?
attributes_to_insert['created_at'] = current_time
end
if columns_hash.has_key?('updated_at') && attributes_to_insert['updated_at'].nil?
attributes_to_insert['updated_at'] = current_time
end
# Don't insert nil values with a DB default!
attributes_to_insert =
attributes_to_insert.select do |key, value|
!value.nil? || column_for_attribute(key).default_function.blank?
end
# Calling #relation provides us with a model that is free of create_with args!
inserted_attributes = relation.insert(attributes_to_insert, returning: returning_attributes).to_a.first
if inserted_attributes.present?
record.assign_attributes(inserted_attributes.compact)
record.changes_applied
record.instance_variable_set('@new_record', false)
record.instance_variable_set('@previously_new_record', true)
record
else
throw :abort
end
end
end
end
# Trigger after_commit on our model once the current transaction is complete!
connection.current_transaction.after_commit { record._run_commit_callbacks } if inserted_record.present?
end
inserted_record || find_by!(attributes)
end
As you can see: while it’s functional, reaching into model internals is less than ideal, it’d be great if this feature was native within active record!
Proposal
Add an options argument to the create and create! methods which sets on_duplicate like so:
def create!(attributes = nil, options = { on_duplicate: nil }, &block)
Models will need to provide some feedback that the insertion has failed which needs to differ from insert and other direct DB methods as those methods return a database result object which returns the rows inserted.
In the case of passing on_duplicate: :ignore the create! method could raise an exception and for create where the model is returned, there can be some feedback on the model itself that it did not insert. persisted? could return false for example, though it’d be useful to also have an error populated or a query method on the model (perhaps insert_ignored?) to know that we have ignored the insert.
Additionally the create_or_find_by! and related methods could also take an options argument like so:
def create_or_find_by!(attributes, options = { on_duplicate: nil }, &block)
which removes the need to create a subtransaction via transaction(requires_new: true) where supported by the database!
I would love to contribute, to add this new feature to active record. Or otherwise provide any mechanism to allow the use of ON CONFLICT together with the active record model lifecycle.
Would there be appetite for such a contribution?