Idea for handling dirty active_record serialized columns

Hello All,

Currently, serialized columns get saved to the database whether they have changed or not (Github). Because serialized columns can be modified in place, Dirty is not used for them. This behavior introduces too much database overhead for us, so I’d like to patch it out.

Having trouble extending ActiveRecord

ActiveRecord::Base.ancestors returns ActiveRecord::Core before ActiveRecord::AttributeMethods::Dirty, so the methods in Core take precedence over those in Dirty.
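To illustrate the precedence rule, here is a minimal sketch using plain Ruby modules. SketchCore and SketchDirty are hypothetical stand-ins, not the real ActiveRecord modules, and changed_source is an invented method name. The module included last lands earlier in ancestors and wins method lookup.

```ruby
# Hypothetical stand-ins for ActiveRecord::Core and
# ActiveRecord::AttributeMethods::Dirty; not the real modules.
module SketchCore
  def changed_source
    "Core"
  end
end

module SketchDirty
  def changed_source
    "Dirty"
  end
end

# Core is included last, so it sits in front of Dirty in the ancestor chain.
class CoreFirst
  include SketchDirty
  include SketchCore
end

# Reversing the include order puts Dirty in front instead.
class DirtyFirst
  include SketchCore
  include SketchDirty
end

CoreFirst.ancestors.first(3)   # => [CoreFirst, SketchCore, SketchDirty]
CoreFirst.new.changed_source   # => "Core"
DirtyFirst.new.changed_source  # => "Dirty"
```

In this sketch, a method in the Dirty stand-in can only override the one in Core (and call super) when Dirty is included after Core.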

I would have thought that Dirty overrides the behavior of Core and so should have higher precedence. This would allow @changed_attributes to move from Core to Dirty.

Is there history to this decision? Is it even possible to change this, both technically and logistically, given how core it is to the code?

Changing the way Dirty is implemented

I played around with storing the original values for an attribute, rather than just the values that have been changed (Github). Cloning and storing @original_values only on attribute access keeps the code as close as possible to the current @changed_attributes implementation. It works not only with serialized columns but also with code like model.field.gsub!('a','b'), without requiring field_will_change!
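A rough sketch of the idea in plain Ruby, not the actual patch: OriginalValueTracker and its read, write and changed methods are hypothetical names chosen for illustration. It snapshots a value the first time the attribute is touched and compares against that snapshot later.

```ruby
# Hypothetical model of "store the original value on access";
# this is not ActiveRecord code.
class OriginalValueTracker
  def initialize(attributes)
    @attributes = attributes
    @original_values = {}
  end

  # A read is the earliest point at which the caller could mutate the
  # value in place, so snapshot it here (once per attribute).
  def read(name)
    remember_original(name)
    @attributes[name]
  end

  def write(name, value)
    remember_original(name)
    @attributes[name] = value
  end

  # Names of attributes whose current value differs from the snapshot.
  def changed
    @original_values.select { |name, original| @attributes[name] != original }.keys
  end

  private

  def remember_original(name)
    return if @original_values.key?(name)
    # Shallow copy only: nested objects are still shared with the original.
    @original_values[name] = @attributes[name].dup
  end
end

record = OriginalValueTracker.new(field: +"banana")
record.read(:field).gsub!("a", "b")  # in-place change, no *_will_change! call
record.changed                       # => [:field]
```

Attributes that are never read or written cost nothing here, which is the main appeal. The shallow dup is the weak spot for nested values.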

It was more of a whim, but I wanted to know what people thought.

A more practical solution

I also wrote a more traditional fix (Github). But something in me is starting to like changing the way Dirty works, even though it seems a bit much.

Any direction would be appreciated.

Thanks for all the hard work on Rails,

kbrock

Cloning fails in exciting and (likely) unexpected ways when the serialized value is deeper than one level:

irb: hash = { a: [1,2,3], b: [4,5,6] }

===> {:a=>[1, 2, 3], :b=>[4, 5, 6]}

irb: hash2 = hash.clone

===> {:a=>[1, 2, 3], :b=>[4, 5, 6]}

irb: hash2[:a] << 'baz'

===> [1, 2, 3, "baz"]

irb: hash

===> {:a=>[1, 2, 3, "baz"], :b=>[4, 5, 6]}

The only way I’ve found to reliably deep-clone objects is to chain together Marshal.dump and Marshal.load, but that’s almost got to be terrible for performance.
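For reference, a minimal sketch of that Marshal round trip next to a shallow clone. deep_copy is a hypothetical helper name, not something Ruby or Rails provides under that name.

```ruby
# Hypothetical helper: deep-copies by serializing and deserializing.
# Marshal.dump raises TypeError for objects it cannot serialize
# (procs, IO objects, objects with singleton methods).
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

hash = { a: [1, 2, 3], b: [4, 5, 6] }

deep = deep_copy(hash)
deep[:a] << "baz"
hash[:a]   # => [1, 2, 3]  (the original is untouched)

shallow = hash.clone
shallow[:a] << "baz"
hash[:a]   # => [1, 2, 3, "baz"]  (the nested array is shared)
```

The round trip walks and re-allocates the whole object graph, so its cost grows with the size of the serialized value.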

—Matt Jones