Excessive redundant object allocation in AR

That’s a big one, and it would be something that needs to be addressed in Ruby, not in Rails. But the problem is that you would have unintuitive behavior for those used to doing things like:

    s = 'Error'
    s.chomp!('or')

In today’s Ruby and jruby-1.7.0.preview2:

    $ irb
    jruby-1.7.0.preview2 :001 > "Error".object_id
     => 2042
    jruby-1.7.0.preview2 :002 > "Error".object_id
     => 2044
    jruby-1.7.0.preview2 :003 > "Error".chomp!('or').object_id
     => 2046
    jruby-1.7.0.preview2 :004 > s = "Error"
     => "Error"
    jruby-1.7.0.preview2 :005 > s.object_id
     => 2048
    jruby-1.7.0.preview2 :006 > s.chomp!('or')
     => "Err"
    jruby-1.7.0.preview2 :007 > s.object_id
     => 2048

See, when you are just working with string literals willy-nilly, each one creates a new instance, so you don’t have to worry about the “bang” methods altering an object that some other piece of code is also holding.

In a StringPool’d Ruby, the bang methods would need to return a string with the same object_id, so that existing code that depends on object identity would still work, but they could not alter the “Error” string in the StringPool or things would go terribly wrong.
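To make the conflict concrete, here is a rough sketch. The `pool` hash is purely hypothetical (Ruby has no such StringPool); it just hands out one shared String per distinct content so you can see what an in-place bang method does to everyone else holding that string.

```ruby
# Hypothetical pool: one shared, mutable String instance per distinct content.
pool = Hash.new { |h, k| h[k] = String.new(k) }

a = pool["Error"]
b = pool["Error"]
puts a.equal?(b)   # both names refer to the same pooled object

a.chomp!("or")     # mutates the shared object in place
puts b             # => "Err", so every other holder of "Error" is now wrong

# Freezing the pooled entries prevents the corruption, but then bang
# methods raise instead of working in place, which breaks existing code.
frozen = "Error".dup.freeze
begin
  frozen.chomp!("or")
rescue RuntimeError => e # FrozenError on newer Rubies is a RuntimeError subclass
  puts "frozen pool entry: #{e.class}"
end
```

Either way something has to give: a mutable pool corrupts other users of the string, and a frozen pool changes what the bang methods do.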

Feel free to take this up on the ruby list, and post back the link. I’m sure that those guys could figure out a way to make it work if they’ve not already discussed it, but my guess is it would be a breaking major change, even if it is necessary to reduce the number of objects and make things faster.

It could be implemented in Rails by using a container class to hold the database field names that are used as the keys inside the AR @attributes hash, and reusing the same string object across instances. Those strings are frozen anyway so the concern about modification doesn’t apply. Based on the ObjectSpace data, that one change would have a large impact on the number of allocated subobjects for each AR model instance.
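Something along these lines is what I have in mind. This is only a sketch: `AttributeNamePool` and `fetch` are names made up for illustration, not anything that exists in Rails, and the row hashes stand in for the per-instance @attributes hash.

```ruby
# Hypothetical container that hands out one frozen String per field name.
class AttributeNamePool
  def initialize
    @names = {}
  end

  # Returns the shared frozen instance for +name+, creating it on first use.
  def fetch(name)
    @names[name] ||= name.dup.freeze
  end
end

pool = AttributeNamePool.new

# Two "rows" built with pooled keys. Hash#[]= keeps an already frozen
# String key as-is, so both hashes end up holding the same key object.
row1 = {}
row1[pool.fetch("title")] = "First"
row2 = {}
row2[pool.fetch("title")] = "Second"

puts row1.keys.first.equal?(row2.keys.first)
puts row1.keys.first.frozen?
```

Whether this saves much in practice depends on the Ruby implementation and version, so it would need to be measured against the ObjectSpace numbers rather than assumed.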

To be honest I think we should just change @attributes to be keyed by symbols. I don't see that there is a DoS vector in doing this since the keys aren't going to come from user input (however, I do need to think about that a bit more before I say so confidently).

I changed @attributes_cache to be keyed by symbols recently, which led to a nice speed-up in attribute access (before then we were creating a new string every time you called an attribute method).
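Roughly, the idea looks like the sketch below. `SketchRecord` and its `attribute` macro are hypothetical and much simpler than the real ActiveRecord code; the point is only that the symbol key and the frozen string name are created once, when the reader is defined, rather than on every call.

```ruby
# Hypothetical, stripped-down model; not the actual ActiveRecord implementation.
class SketchRecord
  def initialize(attributes)
    @attributes = attributes  # string-keyed, as described in this thread
    @attributes_cache = {}    # symbol-keyed
  end

  # Defines a reader that looks up the cache by symbol.
  def self.attribute(name)
    key = name.to_sym
    column = name.to_s.freeze # built once here, not per call
    define_method(key) do
      @attributes_cache.fetch(key) { @attributes_cache[key] = @attributes[column] }
    end
  end

  attribute :title
end

record = SketchRecord.new("title" => "Hello")
puts record.title # => Hello
```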

It should be noted that these things could theoretically be optimised at the implementation level. I did some benchmarking a while back and there was no difference between using symbols and strings in @attributes on JRuby. However on a practical level, I think we should change it.
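For anyone who wants to check their own implementation, a micro-benchmark along these lines is enough to get a feel for it. This is not the benchmark I ran back then, just a minimal sketch; the iteration count is arbitrary, and the numbers will differ between MRI, JRuby and other versions.

```ruby
require "benchmark"

iterations = 200_000
string_keyed = { "title" => "Hello" }
symbol_keyed = { title: "Hello" }

# Time repeated lookups with a string key and with a symbol key.
string_time = Benchmark.realtime { iterations.times { string_keyed["title"] } }
symbol_time = Benchmark.realtime { iterations.times { symbol_keyed[:title] } }

puts format("string keys: %.4fs, symbol keys: %.4fs", string_time, symbol_time)
```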

I'm interested to hear what Mr T. Love thinks.

That would be even better, if it’s not too hard to change to symbols.