Duping Active Record objects in 3.0.x is broken in a couple of ways,
coming from unintended behaviour introduced in a commit that was only
meant to apply to #clone. Here is a ticket with an explanation and a
patch to fix it:
https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/5917
There is one thing I'm a bit iffy on with the patch, which is outlined
in the ticket, so feedback is needed.
Thanks,
James
Hi James,
I think we need to more concretely define what it means to "dup" or
"clone" an ActiveRecord object. After studying the documentation for
"dup" and "clone" in Ruby's core classes:
http://ruby-doc.org/ruby-1.9/classes/Object.html#M000215
http://ruby-doc.org/ruby-1.9/classes/Object.html#M000214
It seems to me that the main difference between the two is captured in
these sentences:
In general, clone and dup may have different semantics in descendent
classes. While clone is used to duplicate an object, including its
internal state, dup typically uses the class of the descendent object
to create the new instance.
I interpret this as _clone_'d objects should not be new objects, as they
don't do anything specific with the descendent objects. In contrast,
_dup_'d objects should be new, saveable, records.
In that case, we should set new_record to true on dup'd objects and
remove the primary key.
What do you think?
Hi Aaron, James,
I created a patch earlier this year that will prevent the timestamps to be persisted on clone’d objects. https://rails.lighthouseapp.com/projects/8994/tickets/4538-activerecordclone-does-not-clear-out-timestamps#ticket-4538-14
If I understand correctly that sentence:
*> I interpret this as clone’d objects should not be new objects, as they
don’t do anything specific with the descendent objects. In contrast,
dup’d objects should be new, saveable, records.*
Then should we should mark this ticket as invalid ?
Hi Aaron,
Thanks for looking at this. Yes, I saw the same thing in the ruby docs
and considered suggesting something similar. However, the docs for
clone also say that it copies the frozen state of the object. If we
really want to make it more ruby-like, we could make Active Record's
clone the "direct copy" version and make dup the "new record with same
attributes" version. But then we'd also want to change clone to
maintain the frozen state of a frozen record, which means there's no
way to "just unfreeze" a record (this happened to be our use case when
we ran into this issue).
James
Hmm.. I guess I don't understand. clone *should* copy the frozen
state. What's wrong with copying the frozen state? Why do you need to
unfreeze the record? "dup" will create an unfrozen copy, could you use
that?
Sorry if I'm asking redundant questions, but I'm totally late to the
party on this issue. :-/
No problem, I should have explained a bit more. The reason we wanted
to use dup to unfreeze records is that there was a change in
ActiveSupport::Cache which started freezing all objects as they are
read from the cache:
https://github.com/rails/rails/blob/710dcf826a073720b817/activesupport/lib/active_support/cache.rb#L578-580
So originally the problem was that we couldn't even modify the
attributes of the retrieved record because its attributes hash was
frozen. We started duping the records to unfreeze them, and then we
started getting the duplicate key errors when trying to save those
duped records.
Personally I don't think the cache should be freezing things like that
anyhow, but that's a separate issue. I think in general terms, it's
good to have a method around like dup which just gives you an unfrozen
copy of a record, which behaves pretty much in the same way as the
original.
When I wrote the patch, I was heavily prioritizing backwards
compatibility. I think if I were writing all this from scratch, I'd
keep both clone and dup really simple and very close to ruby's basic
semantics outlined in those docs (basically both would have the same
behaviour as I made dup have in my patch, but clone would preserve
frozen state). Then if you wanted to create a new record with the same
attributes but a nil primary key, you'd maybe just do
"bobs_identical_twin = Person.new(bob)".
P.S. I don't know why it keeps changing the discussion subject on my
behalf...
>
> Hmm.. I guess I don't understand. clone *should* copy the frozen
> state. What's wrong with copying the frozen state? Why do you need to
> unfreeze the record? "dup" will create an unfrozen copy, could you use
> that?
>
> Sorry if I'm asking redundant questions, but I'm totally late to the
> party on this issue. :-/
>
No problem, I should have explained a bit more. The reason we wanted
to use dup to unfreeze records is that there was a change in
ActiveSupport::Cache which started freezing all objects as they are
read from the cache:
https://github.com/rails/rails/blob/710dcf826a073720b817/activesupport/lib/active_support/cache.rb#L578-580
So originally the problem was that we couldn't even modify the
attributes of the retrieved record because its attributes hash was
frozen. We started duping the records to unfreeze them, and then we
started getting the duplicate key errors when trying to save those
duped records.
Personally I don't think the cache should be freezing things like that
anyhow, but that's a separate issue. I think in general terms, it's
good to have a method around like dup which just gives you an unfrozen
copy of a record, which behaves pretty much in the same way as the
original.
I see. Yes, I also think this is strange behavior for the cache. But
we'll deal with that later. 
When I wrote the patch, I was heavily prioritizing backwards
compatibility. I think if I were writing all this from scratch, I'd
keep both clone and dup really simple and very close to ruby's basic
semantics outlined in those docs (basically both would have the same
behaviour as I made dup have in my patch, but clone would preserve
frozen state). Then if you wanted to create a new record with the same
attributes but a nil primary key, you'd maybe just do
"bobs_identical_twin = Person.new(bob)".
It seems you were prioritizing for backwards compat with 2.3.x, right?
I find it confusing that AR clone / dup semantics don't follow Ruby's
semantics.
Can we apply your patch for 3.0.x, but make 3.1.x more closely follow
Ruby?
It seems you were prioritizing for backwards compat with 2.3.x, right?
Yes, and also backwards compatibility with the intent of the patch
that changed things, i.e. still allowing people to override
initialize_copy in the same way.
I find it confusing that AR clone / dup semantics don't follow Ruby's
semantics.
Yup! Me too 
Can we apply your patch for 3.0.x, but make 3.1.x more closely follow
Ruby?
I think that's a good idea.
It looks like that ticket would still be just as valid for 3.0.x. If
we go back to more of a back-to-basics approach with clone/dup in 3.1,
then the ticket would have to be changed to reflect whatever method
ends up taking on clone's current behaviour.
James
Yes. I am going to apply this to 3-0-stable, but not master. I would
like for clone / dup to mirror Ruby's behavior in future releases.