Find eager loads correct records but generates excess queries (Rails 2.1)

Models:   S :has_many F; F :belongs_to P; S :has_many P :through F.

Query looks like this:   S.find_all_by_key([array of keys], :include => [:F, :P])

This produces the correct and expected result, but very slowly. On the log: SELECT * FROM S WHERE KEY IN (keys) SELECT * FROM F WHERE S_ID IN (S.ids from first query) SELECT * FROM F WHERE S_ID IN (S.ids from first query) [same query twice] SELECT * FROM P WHERE ID IN (ids from second query)

At this point all the required data has been retrieved and Rails is looking good. Time to quit. However: SELECT * FROM P WHERE ID = first id from second query SELECT * FROM P WHERE ID = second id from second query ... [typically about 50-200 queries, one query per P.ID]

Q1. Why so many extraneous queries? Q2. Is there an easy fix, or is this just too hard for Rails? Q3. In general, is it possible to control eager loading at a per-query level?

Models: S :has_many F; F :belongs_to P; S :has_many P :through F.

Query looks like this: S.find_all_by_key([array of keys], :include => [:F, :P])

This produces the correct and expected result, but very slowly. On the log: SELECT * FROM S WHERE KEY IN (keys) SELECT * FROM F WHERE S_ID IN (S.ids from first query) SELECT * FROM F WHERE S_ID IN (S.ids from first query) [same query twice] SELECT * FROM P WHERE ID IN (ids from second query)

At this point all the required data has been retrieved and Rails is looking good. Time to quit. However: SELECT * FROM P WHERE ID = first id from second query SELECT * FROM P WHERE ID = second id from second query ... [typically about 50-200 queries, one query per P.ID]

That's odd. is that from your eager loading or from your subsequent use of the data ( also anonymising the table names to S, F, P makes it rather hard to read than if there were meaningful names) (in particular eager loading is unidirectional - if person has_many posts, then Person.find(:all, :include => :posts) doesn't eager load post.person

Fred

All SQL queries are generated by that one, single find() call. This is really scary -- there is nothing in the documentation to suggest that it can do this kind of "fake join", or the performance impact it can have.

Sorry about the anonymisation, but the real table names would only confuse things worse.

I suspect this is a bug/feature of Rails eager loading, caused by the "triangular" relationships. S loads F using has_many, then F loads P using belongs_to, then S loads P again using has_many through. It "should" be smart enough not to, but that logic isn't in there. The documentation is absolutely no help, and the source code is way too complicated for this bear.

Q3 is the answer. I fixed the problem (with dramatic performance improvement) using :include {hash}, like so.   S.find_all_by_key([array of keys], :include => {:F => :P})

Magic!