Feature Request: Delete associated records when model is saved

Jeremy_Mickelson · February 16, 2016, 3:51pm

I originally posted this to the Rails GitHub issue tracker (my apologies) here: Delete associated records when model is saved · Issue #23490 · rails/rails · GitHub. I’m now posting it in the correct place.

I am trying to mutate objects in memory without persisting them to the database. We always mutate the object, use it for a few business purposes, and then only sometimes save the changes; other times we reject them. Active Record already supports modifying existing records and adding new records in memory, and those changes only get persisted to the database on save. However, deletions seem to always happen immediately. I believe this is the same idea that was put forth in this issue: #6994

The simplest example I can extract from my code would be something like this:

class CommunicationSetting < ActiveRecord::Base
  has_one :trigger, autosave: true
end

class Trigger < ActiveRecord::Base
  belongs_to :communication_setting, touch: true
end

c = CommunicationSetting.find(1)
c.trigger # => #<TriggerA:0x007faeba3dfa70 id: 1, communication_setting_id: 1>

# The following line runs a DELETE query on the database
c.trigger.delete # or destroy

# I want the delete query to run here, in case I decide to discard the changes
c.save!

I have also tried the mark_for_destruction method. That works exactly as I would like for the SQL queries. However, the object acts as if the value still exists, so it’s not in a consistent state.

c = CommunicationSetting.find(1)
c.trigger # => #<TriggerA:0x007faeba3dfa70 id: 1, communication_setting_id: 1>

# No DELETE is run here. Yay!
c.trigger.mark_for_destruction
# Unfortunately, the trigger still acts as if it is still here
c.trigger # => #<TriggerA:0x007faeba3dfa70 id: 1, communication_setting_id: 1>

# The DELETE query does run here like I want
c.save!

A similar occurrence happens on has_many associations, but it was simpler to demonstrate on the has_one because we don’t have to think about the array at the same time.

Another point of reference. This guy on Stack Overflow did a good job explaining the situation: http://stackoverflow.com/questions/11353582/delete-associated-records-when-model-is-saved

I can vouch that the behavior described occurs on Rails version 4.2.5, which is what I’m running in our project.

Nicholas_Firth-McCoy · February 17, 2016, 9:04pm

Could you run your code within a transaction and call the existing `destroy` method, and then rollback the transaction in the case that you don't want the deletion to persist?

Can you share some real world examples showing why you'd need to be able to soft delete the associated records? There might be other, better workarounds.

My guess is that this would be a complicated feature to add, but I'm not too familiar with the parts of ActiveRecord that this would touch.

Jeremy_Mickelson · February 18, 2016, 7:01pm

In our specific project we have an object called CommunicationSetting that defines an automated email that a client is setting up. That setting has many different child objects, Filters for example, which would exclude or include people from the recipient list. In this example we would like the client to be able to test how changes to their filters will affect the recipient list. So we would like to take their proposed changes (which might include additions, modifications, and deletions), modify the objects in memory, get the list, and return it to the UI so the user can review it. If the user likes the changes, they can hit save to persist the object to the database, or if they don’t like them they can abandon their changes and leave the objects in the database as is.

If deletions could be deferred until save time, then running these types of experiments becomes trivial. The transaction workaround is plausible, but the whole point of having ActiveRecord objects in memory is the ability to modify them without persisting. Right now the behavior is inconsistent. Additions and modifications to objects in a relation are performed in memory only, while deletions are immediately persisted to the database. I think that the inconsistency in behavior is the biggest problem.

If I had something like this:

c = CommunicationSetting.find(1)
the_filters = c.filters # => [#<Filter:0x007f9abe7c3408>, #<Filter:0x007f9abe7b2b30>]

Then I changed the_filters by modifying one, removing one, and adding a new one, and then executed

c.filters = the_filters

The modification and addition would be in memory only, while the deletion is persisted to the database immediately. This seems very inconsistent and counterintuitive.

Geoff_Harcourt · February 18, 2016, 7:15pm

One way you could handle this would be to add a virtual attribute called marked_for_deletion to your model with attr_accessor. You could then use that flag as a temporary change to your model without deleting it, and then delete those objects in the final DB transaction after the user approves the proposed changes. An advantage of this approach is that if your user abandons the approval, none of your changes have been persisted to the database (the virtual attribute is lost as soon as the model is no longer being referenced).
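A minimal sketch of the flag idea, using a plain Ruby class as a stand-in for the model so it runs without a database. FilterStub, active_filters, and filters_to_delete are hypothetical names for illustration; in a real app the attr_accessor would go on the ActiveRecord model itself, and the flagged records would be destroyed only once the user approves.

```ruby
# Plain-Ruby stand-in for an ActiveRecord model; FilterStub and its
# fields are hypothetical names used only for illustration.
class FilterStub
  attr_reader :id, :name
  # Virtual flag: lives only in memory and is never written to the database.
  attr_accessor :marked_for_deletion

  def initialize(id, name)
    @id = id
    @name = name
    @marked_for_deletion = false
  end
end

# Filters that should still count when previewing the proposed changes.
def active_filters(filters)
  filters.reject(&:marked_for_deletion)
end

# Filters that would actually be destroyed once the user approves.
def filters_to_delete(filters)
  filters.select(&:marked_for_deletion)
end

filters = [
  FilterStub.new(1, "opted_in"),
  FilterStub.new(2, "region"),
  FilterStub.new(3, "age")
]

# Flag one filter for removal; nothing is deleted at this point.
filters[1].marked_for_deletion = true

preview_ids = active_filters(filters).map(&:id)     # ids used for the preview
pending_ids = filters_to_delete(filters).map(&:id)  # ids to destroy on approval
```

If the user abandons the change, the flagged objects are simply dropped and nothing needs to be undone.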

Another approach you could use would be the “soft delete”, where deleted models aren’t removed from the database, but are rather marked with a flag or a deleted_at timestamp. If you adopted that strategy, you would avoid having the records disappear from your database, and you could unwind the action fairly easily.

In a non-soft delete scenario, I think calling #delete or #destroy on a model and not having it be deleted immediately would be unexpected behavior.

Jeremy_Mickelson · February 18, 2016, 7:26pm

I agree completely that delete and destroy should remove objects from the database immediately. The situation that I’m talking about is not really related to the deletion of individual objects; it’s about how associations behave.

If I have a model with a has_many, and I alter the array that represents that association, it’s not clear that some of those changes are in memory and some are immediately persisted to the database. I would expect the parent model to keep internal state representing either the objects to delete on save or a copy of the original array so it can diff at save time and detect the changes.

For example:


f = Filter.first

f.destroy # happens immediately

c = CommunicationSetting.first

the_filters = c.filters # contains 3 items

new_filters = the_filters.first(2) # keep only 2 of them

c.filters = new_filters # right now Active Record deletes the removed filter immediately

c.save! # I propose that it should wait to delete the removed filter until here
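The "copy of the original array, diff at save time" idea could look roughly like the sketch below. It is plain Ruby with no Rails involved, and DeferredCollection with its methods is a made-up name for illustration, not an existing Active Record API.

```ruby
# Hypothetical in-memory wrapper; DeferredCollection is not a Rails API.
class DeferredCollection
  attr_reader :current

  def initialize(records)
    @original = records.dup # snapshot of what is currently persisted
    @current  = records.dup # working copy, free to mutate in memory
  end

  # Swap in a new list without touching the snapshot.
  def replace(records)
    @current = records.dup
  end

  # Present at load time but missing now: these would be deleted on save.
  def removed
    @original - @current
  end

  # Added since load time: these would be inserted on save.
  def added
    @current - @original
  end

  # Stand-in for save: hands the pending changes to the caller's block,
  # then makes the working copy the new snapshot.
  def commit
    yield removed, added
    @original = @current.dup
  end

  # Throw away the in-memory changes and go back to the snapshot.
  def discard
    @current = @original.dup
  end
end

filters = DeferredCollection.new([:a, :b, :c])
filters.replace([:a, :c, :d])

pending_removed = filters.removed
pending_added   = filters.added

filters.discard
after_discard = filters.current
```

Nothing is deleted until commit is called, and discard returns the collection to its loaded state.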

Jeremy_Mickelson · July 7, 2016, 8:32pm

Any thoughts on this proposal?

Jason_FB · July 7, 2016, 9:24pm

It’s an interesting idea, but my instinct says it’s a huge change. We have spent 12 years learning that associated relations are mutated immediately (same behavior exists when you append to an association-- the db is touched immediately), and a change like this could be massively costly across the planet’s ROR installations.

Then again, I do find the black magic of AR sometimes difficult to reckon with, especially when you save objects and their related objects also need to get saved— how can you know which order saves & callbacks happen? (This is a HUGE problem in the Spree/Solidus codebases and has led to hundreds of thousands of hours of wasted productivity due to the black magic of AR)

So, while I support all of this being cleaned up and better documented, I think it’s kind of like a Rails 6 thing in my mind since it seems like a huge change.

-Jason

Jeremy_Mickelson · July 7, 2016, 10:29pm

I completely agree that it would be a drastic change, very much not backwards compatible. So something like this should not be taken lightly. I just thought it was worth opening a dialogue because it seems to me that there is a big inconsistency in the way AR deals with in-memory objects. Attributes are modified only in memory until you call save, but associations are saved to the database immediately (there is no way to do an in-memory change). That inconsistency cost our small company (3 developers) nearly a hundred hours of head scratching to get around (which is very expensive for a company of our size).

I propose that as part of Rails 6, we change the way association modifications are handled so that they stay in memory until save is called.

Jason_FB · July 7, 2016, 11:11pm

Yes, and then you’re gonna have to think too about which order your callbacks get called when you do call save… do the parent object callbacks get called first or do the child objects get called first? And what happens if one of the child before_save callbacks, or its validation, causes the object to be un-savable (in error state), do the rest of the related objects also not get saved?

It’s an issue.

I think an upgraded DSL (Domain Specific Language), which Active Record is whether or not it pretends otherwise, would be significantly more robust, would allow these kinds of design flaws to be sussed out and at least become transparent to the developer, and would move away from over-reliance on the callback pattern generally. A sophisticated ORM could use composition objects (like in DCI patterns), themselves representing business logic, that you would attach callbacks to in certain contexts, giving them clear patterns for what gets called before what (as in the case of object chains) and what the failure cases are.

It’s one of the things people talk about when they talk about things they don’t like about Active Record, you are absolutely right.

It seems to me that your proposal could be done with an experimental “Add later” or “Destroy later” feature on the relations themselves, perhaps configured inline in the spot where you make the call, or perhaps configured at the relationship level. This feature could then defer the changes on the relation until the primary object is saved. But, as I discussed above, when you think about the Pandora’s box of validation errors that this would open, you then understand why the existing Rails behavior eliminates these kinds of flow problems by doing operations on relations immediately, as you correctly have identified.

-Jason

Jeremy_Mickelson · July 7, 2016, 11:28pm

How very interesting. The callback and validation logic that you bring up sounds like it could get complicated fast.

From the perspective of my use case, I would expect none of the records to save. I have a CommunicationSetting that is treated as one big object from the perspective of the app, but its storage is split across multiple AR models and tables. So what I am doing is modifying a communication setting and its sub records, then calling .save! on the communication setting. If any of the validations failed I would want nothing to save and none of the after save callbacks to fire.
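As a rough illustration of that all-or-nothing expectation, here is a plain-Ruby sketch with no Rails involved. RecordStub, save_all_or_nothing, and the array standing in for the database are hypothetical names invented for this example.

```ruby
# Hypothetical stand-in for a record with a validation; not a Rails class.
RecordStub = Struct.new(:name, :valid) do
  def valid?
    valid
  end
end

# Validate the parent and every child up front and persist only if all pass.
# `persisted` is a plain array standing in for the database.
def save_all_or_nothing(parent, children, persisted)
  records = [parent, *children]
  return false unless records.all?(&:valid?)

  records.each { |record| persisted << record.name }
  true
end

db = []

# One child fails validation, so nothing at all is written.
failed = save_all_or_nothing(
  RecordStub.new("setting", true),
  [RecordStub.new("filter_a", true), RecordStub.new("filter_b", false)],
  db
)
db_after_failure = db.dup

# Every record is valid, so the parent and children are written together.
succeeded = save_all_or_nothing(
  RecordStub.new("setting", true),
  [RecordStub.new("filter_a", true)],
  db
)
```

The point is only the ordering: every validation runs before the first write, so a failing child leaves the stand-in database untouched.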

I can see how this could get complicated, but it seems like it could be manageable by looking at which object originally received the save call and treating the whole thing like one atomic operation. In fact, isn’t this the way it currently works for modifications (as opposed to deletes), as long as you have autosave: true enabled on the relation? If I have a parent object and I modify both it and some of its child records, what happens if some of the validations fail? This is how I currently have my CommunicationSetting class set up and it works great, for everything except deletions.

I would be totally cool with an “add/destroy later” feature that I could enable on the relation because that would allow me to configure this meta record to work the way I want. If we did it that way it wouldn’t even be backward incompatible and wouldn’t have to wait for Rails 6.