Gabe,
This is a lengthy response, sorry to those not interested
I guess the issue I have with your destructive patch is that it hobbles existing behavior which (for me) returns a predictable dataset.
I’ve always known that :conditions which relate to :included associations can cause only part of the association’s data to be loaded (I think that’s been referred to as the mephisto bug). When I’ve used it in the past it’s been because that’s exactly what I want.
Setting aside the performance issues (which you address in your non-destructive patch), the problem as I see it is that there is no way to tell find() that you want certain conditions applied in the prequery and a different set of conditions applied in the data-loading query (which I’ll call ‘realquery’ from now on).
For the times when I have wanted all of the association’s data yet wanted to limit the top-level selection based on association conditions I’ve done my own prequery (with a limit) to grab the id values I want to present. I wrote a plugin that makes these prequeries a little easier to generate - http://svn.protocool.com/public/plugins/values_of
give me a list of the first 10 live album ids that have errored songs
ids = Album.values_of(“distinct albums.id”,
:joins => ‘inner join songs on songs.album_id = albums.id’,
:conditions => ‘albums.deleted_at is null and songs.errored = 1’, :limit => 10) {|string| string.to_i}
present those albums and all of their songs/artists
Album.find(ids, :include => [:songs, :artists])
That allows me to have 2 sets of :conditions - one for the prequery and one for the realquery - and guarantees that my constraints are always true for my returned data.
So… what I’m thinking is this: what about adding a new option to find() that allows you to specify ‘prequery-only’ conditions? For example, something like a :limit_conditions option that utilized your non-destructive patch?
-
The :limit_conditions option would only be valid in conjunction with :limit.
-
Both :limit_conditions and :conditions options would be applicable to the prequery.
-
Your non-destructive patch would determine the minimum joins required in the prequery based on :limit_conditions and :conditions
-
The :conditions would only be applied to the realquery
Those rules for which conditions apply and when are to make sure that the constraints you specify in your find() is guaranteed ‘true’ for all of the data you get back:
live albums with errored songs, and only those errored songs
as in, current behaviour with your non-destructive patch
Album.find(:all, :limit => 10, :include => [:songs, :artists],
:conditions => ‘albums.deleted_at is null and songs.errored = 1’)
prequery joins songs and applies condition ‘albums.deleted_at is null and songs.errored = 1’
realquery joins songs, artistst and applies condition ‘albums.deleted_at is null and songs.errored = 1’
live albums with errored songs, but I want all songs
Album.find(:all, :limit => 10, :include => [:songs, :artists],
:limit_conditions => ‘albums.deleted_at is null and songs.errored = 1’)
prequery joins songs and applies condition ‘albums.deleted_at is null and songs.errored = 1’
realquery joins songs, artists but applies no conditions
Now, this last example is probably where users might get confused about the interaction between :limit_conditions and :conditions - but the interaction is required in order to guarantee that you get back exactly what you specified in your single call to find().
live albums with errored songs and artists named dave, all songs but the artists relation only has artists called ‘Dave’
Album.find(:all, :limit => 10, :include => [:songs, :artists],
:limit_conditions => ‘albums.deleted_at is null and songs.errored = 1’,
:conditions => [‘artists.name = ?’, “Dave”)
prequery joins songs,artists applies condition ‘albums.deleted_at is null and songs.errored = 1 and artists.name = ‘Dave’’
realquery joins songs,artists and applies condition ‘artists.name = ‘Dave’’
It’s not ideal, and probably a little difficult for newbies to understand, but at least it’s consistent.
Ultimately I have to admit that when performance is a concern or the relations are hairy I will probably continue to reach for values_of() and another (soon to be released) plugin that hydrates associations with subsequent queries rather than with big-honking-joins.
Sorry for rambling, I hope I’ve not strayed too far from the original reasons you started talking about your problems…
Trev