[Ruby on Rails #1642] HasOneThroughAssociation should not be a child of HasManyThroughAssociation

To put not too fine a point on it, this is a fundamentally improper
use of inheritance; the relationship exists to share implementation
rather than interface. Ruby provides modules for this exact
situation. Pulling the shared functionality into a shared module is
straightforward, and simplifies the code by removing special case
interface overrides.

This inheritance relationship has already caused its share of bugs, as
mentioned by commenters on the ticket. I fixed two here:
http://rails.lighthouseapp.com/projects/8994/tickets/895-has_one-through-errors.
More importantly, the added complexity created by importing all of the
collection logic and interface into a non-collection association class
just adds to rigidity and potential for odd bugs in the future.

What concerns me most about this is that resistance to cleaning up
code likely implies a lack of confidence in the test suite.
Considering how core associations are to Rails, and the not
insignificant amount of cruft in that code, there should be tests on
associations like fat kids on an ice cream truck. What are you
concerned the changes will break, given that all the tests pass?

More importantly, the added complexity created by importing all of the
collection logic and interface into a non-collection association class
just adds to rigidity and potential for odd bugs in the future.

Let's do this refactoring when that actually happens.

What concerns me most about this is that resistance to cleaning up
code likely implies a lack of confidence in the test suite.
Considering how core associations are to Rails, and the not
insignificant amount of cruft in that code, there should be tests on
associations like fat kids on an ice cream truck. What are you
concerned the changes will break, given that all the tests pass?

There is no lack of confidence. But I'm not very much in favour of
refactoring without any performance gains or any obscure bug fixes.
Refactoring like these makes almost all the relevant patches in LH
stale and screws up the history. I just don't think the patch in
question here is worth that.

Even if Adam patch make other patches stales, since it cleans up
current ActiveRecord associations code, it wouldn't be hard for others
users to redo their patch. IMHO, it would be even better for people
who are going to make patches in the future.

Rails core guys are obviously very familiar with Associations code in
general, but this is quite hard to catch up in the first time we see
it.

Since Rails core are "cleaning" the tickets in Lighthouse to make a
new release, I think that a great opportunity to apply Adam's patch is
after Rails 2.3 release.

I realize us Rubyists don't necessarily think the classic works in the field such as Fowler's "Refactoring" apply to our day-to-day coding in their entirety but in this case I fully agree with Fowler when he says "You don't decide to refactor, you refactor because you want to do something else, and refactoring helps you do that other thing".

I myself have had my faire share of frustration when browsing around in the AR associations code and as anyone who has ever done more than a few hours of programming knows confusion always – and I mean always! – leads not only to more bugs but also hinders in adding features to the confusing code.

I'd argue those "odd bugs"Adam mentioned have already been there. Many times. There is no *one* bug that shouts "I'm here because this code needs to be refactored!", it usually is much more subtle than this. Years of experience generally show though that the harder code is to understand, the more buggy it will be.

Now *if* there is fantastic test coverage and *if* someone is willing and able (regarding skills and free time) to give it a shot, I say go for it: Fork, refactor, push, and show us!

IMHO there's never a better time to refactor than right *now*. Besides yesterday of course..

Best,
Niels

Pratik wrote:

Given that the GSOC Rails projects involve some ActiveRecord
refactorings, doesn't this type of improvement make sense?

--dwf

Which problem is it solving for you ? Do you see any performance gains
that I'm missing ? Do you believe if this is making any code easier to
understand ? Could you please provide failing tests which are to be
fixed by the refactoring in question ?

Thanks.

I believe that Adam already addresses many of these points in his original post:

"This inheritance relationship has already caused its share of bugs, as
mentioned by commenters on the ticket. I fixed two here:
http://rails.lighthouseapp.com/projects/8994/tickets/895-has_one-through-errors.
More importantly, the added complexity created by importing all of the
collection logic and interface into a non-collection association class
just adds to rigidity and potential for odd bugs in the future."

It is hard to write a failing test for bugs that don't exist, but the
existence of prior already-fixed bugs should be considered as a
motivation for doing the refactor.

Furthermore, the existence of bugs isn't normally the primary
motivation for refactoring - but they are a symptom of a needed
refactoring. Complex, hard-to-understand code is a direct motivation
for refactoring, which I think is the point here.

-- Chad

Having just spent a lot of time wading through the HMT and HOT code,
I'll agree that there is a lot of room for improvement w.r.t.
clarity. And if there is a general agreement that a refactoring is
needed then the longer we wait the tougher the job will be. I would
encourage Adam's effort.

However, having already had discussions with several of you in the
past, I think we all know the problem doesn't stop with HMT and HOT -
it's the entire inheritance hierarchy and mixin structure of
associations. So Adam, are you prepared to expand the scope of your
refactoring to tackle the bigger job? Because I can see some downside
to a partial refactoring that leaves a code base with no track record
and few acolytes. Then finishing the job could actually be harder.

But overall, I think the clock is ticking on this one. Associations
code is already big and complex. Every accepted patch just adds to
the inertia of the current structure that I think everyone knows needs
to be re-thought.

--Chris

Yes, there are a lot of opportunities for refactoring in the
associations code. But, I have two problems with the all-or-nothing
approach. First, refactoring is *by definition* not all or nothing;
it is small, consistent improvements. Turning away an improvement
because it doesn't fix absolutely everything is a step back into
waterfall thinking. Second, I'm happy to continue refactoring
associations, and anything else I find that can use it. Well designed
code pleases me; poorly designed code irks me. But, I'm not inclined
to spend my free time working on refactorings only to see them
rejected because they're "just refactorings."

Yeah, aren't we supposed to be acolytes of Agile, and iterative
improvements?

"...aren't we supposed to be acolytes of Agile, and iterative
improvements?"
Yes. And I didn't mean to imply that a total refactoring of
associations was an absolute condition of "success" or that a "just
refactorings" effort does not have merit.

Roughly, I'm in complete agreement with Adam's original post:
Refactoring the associations code is needed, and the most needy areas
are the most fundamental (inheritance hierarchy, module-versus-
subclass composition, namespaces, etc.).. But I raise the flag that
this whole area is due for a major structural refactoring, not just
HOT/HMT. And, despite the theory, that should have implications for
anybody undertaking a refactoring of HOT/HMT.

-Chris

After the lively discussion at RailsConf regarding refactoring patches
for Rails I'd like to raise awareness once more of my patch to fix the
improper inheritance relationship between HMT and HOT association
classes. The ticket is here:
https://rails.lighthouseapp.com/projects/8994/tickets/1642-hasonethroughassociation-should-not-be-a-child-of-hasmanythroughassociation

For the click-averse among us, here is Pratik's most recent response
to the ticket:

If people with a lot of extra time are looking for some work that might actually add some real value, I'd be happy to help out. But I don't believe this "refactoring" is making the code any easier to follow.

Ignoring for the moment the unanimous support from commenters on the
ticket, here is how this "refactoring" improves the code and makes it
easier to understand:

1) Please take a look at activerecord/lib/active_record/
associations.rb, line 1236. This is the code that defies the
association setter method for all associations. On line 1243 the code
explicitly checks the type of the association_proxy_class and branches
to different logic if it's a HasOneThroughAssociation. For all other
association types the code makes a call to a polymorphic method
without regard to object type, as it should. Changing implementation
based on type interrogation in this way is a clear sign of code
fragility, makes the method longer and less obvious, and causes the
class abstraction to leak. My patch removes the explicit check and
reduces six lines of code from 1243 to 1249 to two. At the same time
it makes private the #create_through_record method in the
HasOneThroughAssociation class, simplifying the public interface to
the class.

From a more pragmatic point of view, this branching logic was a direct
cause of one of the bugs I fixed in
https://rails.lighthouseapp.com/projects/8994/tickets/895-has_one-through-errors
.

2) Please look at activerecord/lib/active_record/associations/
has_many_through_association.rb. This is a long file, 257 lines. My
patch pulls all of the Through-specific login into a module, which
reduces the size of this file to 112 lines, and makes an obvious
distinction between different aspects of the implementation. The
patch also reduced the size of the #find_target method from twelve
lines to two, delegating the scope construction to the shared
#construct_scope method, rather than building the scope inline.

3) Please look at activerecord/lib/active_record/associations/
has_one_through_association.rb. As mentioned before this class
exposes the #create_through_association method in the public
interface, though it performs implementation that should be handled by
the class within its abstraction. My patch makes this method
private. The class also includes special-case overrides of #find,
#find_target, and #reset_target! to mask the functionality inherited
from HasManyThroughAssociation (the override of #reset_target! was the
source of another bug I fixed in
https://rails.lighthouseapp.com/projects/8994/tickets/895-has_one-through-errors).
My patch removes the need for #find and #reset_target!, since the
inherited behavior from HasOneAssociation is already correct, and
changes the implementation of #find_target to mirror the
implementation of #find_target on the HasManyThroughAssociation class,
which is now a peer. In this case, it calls find(:first) on the
reflections class, using the scope generated by
ThroughAssociationScope#construct_scope. I believe this is more clear
than sending #find_target to the HasManyThroughAssociation superclass
with an explicit limit override.

4) This patch makes HasOneThroughAssociation a peer of
HasManyThroughAssociation in the inheritance hierarchy, as it properly
should be. HasOneAssociation is a peer of HasManyAssociation; doing
the same with their Through counterparts is far more obvious than
deepening the hierarchy only under HasManyAssociation. As well, the
current inheritance tree imports all of the code from
association_collection.rb and has_many_association.rb as well as
hash_many_through_association.rb into has_one_through_association.rb.
Moving HasOneThroughAssociation up the inheritance hierarchy to its
proper place removes a lot of inherited cruft that can only cause
complication (for instance, it made
https://rails.lighthouseapp.com/projects/8994/tickets/1643-association-proxies-should-correctly-respond-to-method-defined-via-method_missing
unnecessarily difficult) and confuse programmers who interrogate the
HasOneThroughAssociation class for its capabilities.

5) All of this is in addition to my argument that the current
implementation is a fundamental misuse of inheritance. This is not an
esoteric argument, the concept of encapsulation over inheritance for
sharing of implementation is well worn, baked into Liskov
Substitutability, permeating Design Patterns, advocated regularly by
the likes of Dave Thomas, etc. It's been proven enough times to
matter that it makes code easier to understand, easier to change, and
more resilient to defects. Ignore that at your peril, I suppose.

I'm interested in any and all detailed, technical concerns regarding
this patch. I've updated it to the current head of Rails master, with
no test failures. I've done searches in GitHub and found no plugins
that these changes might affect. And passive aggressive posturing
implying that I'm submitting useless fluff because I have too much
time on my hands isn't a particularly useful, or professional
response.

I'm in favor of your patch and have commented on the ticket.

Still, I think there biggest issues with ActiveRecord is something else:
It is basically gluing together bits and pieces of generated and
provided SQL without much concept of what it is doing.

Try anything interesting involving associations, scopes, and a few
joins. You either end up with too many joins between the same tables or
duplicate aliases the DBMS is going to complain about. What about those
naming conventions when the same table occurs multiple times?

ARec needs to take a step back from concrete SQL strings, toward
abstract models of the various SQL statements. This (huge) step would
allow for better modularity, customizability, and easier expression of
programmer intention. I have no idea how to introduce a change such as
this as it would surely void backward compatibility. Sigh, only
dreaming.

Michael

Sequel already does that, why not just use it if that's what you want?

Jeremy

Indeed, we've been talking about this for a long time. And Nick Kallen started the work required to do this with ActiveRelation. Emilio Tagua is doing a GSOC project this summer to integrate ARel into AR. Also, Nathan Sobo's Unison is a very interesting take on integrating a relational algebra into AR associations (which I think he started as an offshoot of ARel).

Inertia? I've had a quick look and Sequel is indeed appealing. I'll have
a closer look tomorrow.

As Josh points out in another reply, ARec is heading in a similar
direction. I'm wondering whether there's a chance to join forces.

Michael

[...]

> ARec needs to take a step back from concrete SQL strings, toward
> abstract models of the various SQL statements. This (huge) step
> would allow for better modularity, customizability, and easier
> expression of programmer intention. I have no idea how to introduce
> a change such as this as it would surely void backward
> compatibility. Sigh, only dreaming.

Indeed, we've been talking about this for a long time. And Nick
Kallen started the work required to do this with ActiveRelation.
Emilio Tagua is doing a GSOC project this summer to integrate ARel
into AR. Also, Nathan Sobo's Unison is a very interesting take on
integrating a relational algebra into AR associations (which I think
he started as an offshoot of ARel).

Interesting. If people have been talking about this for a long time, I'm
curious where this discussion has been happening. On this list I can
hardly find a trace of it.

Michael