fix ActiveRecord::Base#dup to be like it was in 2.x

Duping Active Record objects in 3.0.x is broken in a couple of ways, coming from unintended behaviour introduced in a commit that was only meant to apply to #clone. Here is a ticket with an explanation and a patch to fix it:

https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/5917

There is one thing I'm a bit iffy on with the patch, which is outlined in the ticket, so feedback is needed.

Thanks, James

Hi James,

I'll take a look.

Hi James,

I think we need to more concretely define what it means to "dup" or "clone" an ActiveRecord object. After studying the documentation for "dup" and "clone" in Ruby's core classes:

  http://ruby-doc.org/ruby-1.9/classes/Object.html#M000215   http://ruby-doc.org/ruby-1.9/classes/Object.html#M000214

It seems to me that the main difference between the two is captured in these sentences:

  In general, clone and dup may have different semantics in descendent   classes. While clone is used to duplicate an object, including its   internal state, dup typically uses the class of the descendent object   to create the new instance.

I interpret this as _clone_'d objects should not be new objects, as they don't do anything specific with the descendent objects. In contrast, _dup_'d objects should be new, saveable, records.

In that case, we should set new_record to true on dup'd objects and remove the primary key.

What do you think?

Hi Aaron, James,

I created a patch earlier this year that will prevent the timestamps to be persisted on clone’d objects. https://rails.lighthouseapp.com/projects/8994/tickets/4538-activerecordclone-does-not-clear-out-timestamps#ticket-4538-14

If I understand correctly that sentence: *> I interpret this as clone’d objects should not be new objects, as they

don’t do anything specific with the descendent objects. In contrast, dup’d objects should be new, saveable, records.*

Then should we should mark this ticket as invalid ?

Hi Aaron,

Thanks for looking at this. Yes, I saw the same thing in the ruby docs and considered suggesting something similar. However, the docs for clone also say that it copies the frozen state of the object. If we really want to make it more ruby-like, we could make Active Record's clone the "direct copy" version and make dup the "new record with same attributes" version. But then we'd also want to change clone to maintain the frozen state of a frozen record, which means there's no way to "just unfreeze" a record (this happened to be our use case when we ran into this issue).

James

Hmm.. I guess I don't understand. clone *should* copy the frozen state. What's wrong with copying the frozen state? Why do you need to unfreeze the record? "dup" will create an unfrozen copy, could you use that?

Sorry if I'm asking redundant questions, but I'm totally late to the party on this issue. :-/

No problem, I should have explained a bit more. The reason we wanted to use dup to unfreeze records is that there was a change in ActiveSupport::Cache which started freezing all objects as they are read from the cache:

https://github.com/rails/rails/blob/710dcf826a073720b817/activesupport/lib/active_support/cache.rb#L578-580

So originally the problem was that we couldn't even modify the attributes of the retrieved record because its attributes hash was frozen. We started duping the records to unfreeze them, and then we started getting the duplicate key errors when trying to save those duped records.

Personally I don't think the cache should be freezing things like that anyhow, but that's a separate issue. I think in general terms, it's good to have a method around like dup which just gives you an unfrozen copy of a record, which behaves pretty much in the same way as the original.

When I wrote the patch, I was heavily prioritizing backwards compatibility. I think if I were writing all this from scratch, I'd keep both clone and dup really simple and very close to ruby's basic semantics outlined in those docs (basically both would have the same behaviour as I made dup have in my patch, but clone would preserve frozen state). Then if you wanted to create a new record with the same attributes but a nil primary key, you'd maybe just do "bobs_identical_twin = Person.new(bob)".

P.S. I don't know why it keeps changing the discussion subject on my behalf...

> > Hmm.. I guess I don't understand. clone *should* copy the frozen > state. What's wrong with copying the frozen state? Why do you need to > unfreeze the record? "dup" will create an unfrozen copy, could you use > that? > > Sorry if I'm asking redundant questions, but I'm totally late to the > party on this issue. :-/ >

No problem, I should have explained a bit more. The reason we wanted to use dup to unfreeze records is that there was a change in ActiveSupport::Cache which started freezing all objects as they are read from the cache:

https://github.com/rails/rails/blob/710dcf826a073720b817/activesupport/lib/active_support/cache.rb#L578-580

So originally the problem was that we couldn't even modify the attributes of the retrieved record because its attributes hash was frozen. We started duping the records to unfreeze them, and then we started getting the duplicate key errors when trying to save those duped records.

Personally I don't think the cache should be freezing things like that anyhow, but that's a separate issue. I think in general terms, it's good to have a method around like dup which just gives you an unfrozen copy of a record, which behaves pretty much in the same way as the original.

I see. Yes, I also think this is strange behavior for the cache. But we'll deal with that later. :slight_smile:

When I wrote the patch, I was heavily prioritizing backwards compatibility. I think if I were writing all this from scratch, I'd keep both clone and dup really simple and very close to ruby's basic semantics outlined in those docs (basically both would have the same behaviour as I made dup have in my patch, but clone would preserve frozen state). Then if you wanted to create a new record with the same attributes but a nil primary key, you'd maybe just do "bobs_identical_twin = Person.new(bob)".

It seems you were prioritizing for backwards compat with 2.3.x, right? I find it confusing that AR clone / dup semantics don't follow Ruby's semantics.

Can we apply your patch for 3.0.x, but make 3.1.x more closely follow Ruby?

It seems you were prioritizing for backwards compat with 2.3.x, right?

Yes, and also backwards compatibility with the intent of the patch that changed things, i.e. still allowing people to override initialize_copy in the same way.

I find it confusing that AR clone / dup semantics don't follow Ruby's semantics.

Yup! Me too :slight_smile:

Can we apply your patch for 3.0.x, but make 3.1.x more closely follow Ruby?

I think that's a good idea.

It looks like that ticket would still be just as valid for 3.0.x. If we go back to more of a back-to-basics approach with clone/dup in 3.1, then the ticket would have to be changed to reflect whatever method ends up taking on clone's current behaviour.

James

Yes. I am going to apply this to 3-0-stable, but not master. I would like for clone / dup to mirror Ruby's behavior in future releases.