Eager loading sort order

Sorry, just thought I'd do a repost because of the junk mail that my message got stuck in between. So, again.... I have tried to wade through the various tickets on this topic and have come away confused. Is the abandonment of an association's sort order during eager loading something we just 'have to live with'? I have a wee patch that I've concocted that honors an association's sort order when loaded eagerly. I'm just wondering : has this been attempted before? Should I save my breath?

/Brad.

Depends what you are talking about. for the new eager loading mechanism, this should already work. For the old eagerloading i suspect that the base case (eager load one association) would be fine, but once you start having multiple (possibly nested associations), all with sort orders it would start to get really really hairy.

Fred

I'm sorry, but I'm not sure what you mean by 'new' vs 'old' eager loading. The eager loading I mean took place when the :include option was given on a find. I looked at the source and it did not seem to apply ordering in this case.

Is there another way of doing eager loading? I would gladly change my code if there is a better way.

As for my solution, I will have to put it through it's paces for sure, but I believe my principals are sound. I just wanted to know if this would be of any value to a broader audience.

/Brad.

I'm sorry, but I'm not sure what you mean by 'new' vs 'old' eager loading. The eager loading I mean took place when the :include option was given on a find. I looked at the source and it did not seem to apply ordering in this case.

Edge rails has a new eager loading scheme (have a look in
association_preload.rb). :include still falls back to the old
implementation in some cases (there's a quick summary at Mixing :include and :conditions - Space Vatican   )

Fred

Thanks so much. That's brilliant! But I guess I need to be "on edge" to have this new eager loading mechanism, eh? Not long I guess.

Thanks so much. That's brilliant! But I guess I need to be "on edge" to have this new eager loading mechanism, eh? Not long I guess.

Yup, you have to be on edge. It would still be interesting to see what
you've done, since we fall back to the old code in some places.

Fred

Fred,

I thought you might be interested in this:

http://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/64

Regards, Trevor

Fred,

I thought you might be interested in this:

#64 eager loading inconsistency - Ruby on Rails - rails

Funny, I was thinking about has_one just this afternoon. It's a
toughie though - while fiddling with some of the internals can get you
the right comment, you're still loading all comments. Not sure off the
top of my head how you get around that (what does the old :include do ?)

Fred

Hey Fred,

"what does the old :include do?" - it ignores your :order so it's arbitrary as far as I'm concerned.

Ironically and by happy coincidence, using mysql and a has_one :order where you want the most recent record tends to actually give you the most recent record even though your :order clause is discarded. Yes, this means that fallback-eager is (probably) favoring the last record in the set.

"you're still loading all comments" - yeah, new-eager and fallback-eager both ask the database for rows that will be discarded.

What I'm bringing up in the ticket is that there are currently 3 different results, none of which are consistent with any others.

1 - Non eager is definitively correct. 2 - fallback-eager ignores :order so can't be trusted [*] 3 - new-eager does the opposite of #1 so is *never* correct

[*] sorta, if you are aware of the issues/limitations it might still work out for you

Both eager styles discard records so the impact is the same - it's just that new-eager is guaranteed to always be wrong.

So now that I've gotten that bit out of the way, you mentioned "still loading all comments".

I've spent a bit of time on the resultset trimming issue for has_one with :order and found:

If all databases supported a postgres-style "distinct on" operation the solution would be trivial. Of course, mysql doesn't have anything like this.

The only query-based approach to trimming the results that I've found to work reliably in mysql is to craft a rather heinous self-join:

select comments.* from comments left join comments_prune on comments.post_id = comments_prune.post_id and comments.created_at < comments_prune.created_at where comments_prune.id is null and comments.post_id in (1,2,3,4,5) order by comments.created_at desc

It would be possible to build such a query by parsing the :order option and adding appropriate 'and' join conditions but it gets quickly out of hand - imagine :order => 'foo asc, bar desc, wibble asc'. BLECH.

The most bulletproof solution when :order is present on a has_one is to just issue 2 queries: first fetch the id and foreign_key ordered appropriately, build a list of ids that discards non-first-for-foreign_key ids and then fetch again using your pruned id list.

But as I said, trimming the resultset isn't relevant to my ticket and I have to admit I'm not sure how much value would be generated by coding up trimming logic anyhow.

Regards, Trevor

Fred,

I thought you might be interested in this:

#64 eager loading inconsistency - Ruby on Rails - rails

Funny, I was thinking about has_one just this afternoon. It's a toughie though - while fiddling with some of the internals can get you the right comment, you're still loading all comments. Not sure off the top of my head how you get around that (what does the old :include do ?)

Fred

Hey Fred,

What I'm bringing up in the ticket is that there are currently 3 different results, none of which are consistent with any others.

1 - Non eager is definitively correct. 2 - fallback-eager ignores :order so can't be trusted [*] 3 - new-eager does the opposite of #1 so is *never* correct

[*] sorta, if you are aware of the issues/limitations it might still work out for you

Both eager styles discard records so the impact is the same - it's just that new-eager is guaranteed to always be wrong.

So now that I've gotten that bit out of the way, you mentioned "still loading all comments".

Yup, agree with you on all that. I was just wondering if we could be smart about that, in addition to resolving the issue on the ticket

I've spent a bit of time on the resultset trimming issue for has_one with :order and found:

If all databases supported a postgres-style "distinct on" operation the solution would be trivial. Of course, mysql doesn't have anything like this.

The only query-based approach to trimming the results that I've found to work reliably in mysql is to craft a rather heinous self-join:

select comments.* from comments left join comments_prune on comments.post_id = comments_prune.post_id and comments.created_at < comments_prune.created_at where comments_prune.id is null and comments.post_id in (1,2,3,4,5) order by comments.created_at desc

It would be possible to build such a query by parsing the :order option and adding appropriate 'and' join conditions but it gets quickly out of hand - imagine :order => 'foo asc, bar desc, wibble asc'. BLECH.

Yeah, ugh. If we grouped by post_id are we guaranteed anything sensible (I seem to remember that what the db gives for non agregate, not grouped on columns can't be relied upon to be anything specific other than one of the rows but I can't remember).

The most bulletproof solution when :order is present on a has_one is to just issue 2 queries: first fetch the id and foreign_key ordered appropriately, build a list of ids that discards non-first-for-foreign_key ids and then fetch again using your pruned id list.

But as I said, trimming the resultset isn't relevant to my ticket and I have to admit I'm not sure how much value would be generated by coding up trimming logic anyhow.

It would be nice to have in addition to what's in your ticket, it seems non trivial though.

Fred

The main hurdle I found with trying to use "group by" is that any "order by" in your query is applied *after* the grouping so you can't use "group by" to try and simulate "distinct on".

If anyone with some cross-database smarts wants to chime in with the ultimate solution I'd love to see it. It's a rare use-case but it seriously bugs me that I can't find a good, cheap, cross-database solution :stuck_out_tongue:

Trev

Yeah, ugh. If we grouped by post_id are we guaranteed anything sensible (I seem to remember that what the db gives for non agregate, not grouped on columns can't be relied upon to be anything specific other than one of the rows but I can't remember).

The main hurdle I found with trying to use "group by" is that any "order by" in your query is applied *after* the grouping so you can't use "group by" to try and simulate "distinct on".

Ah yes, of course. (the only way round that i know of is to order in a
subselect and then select from that and group, but then we're back in
the realm of fugliness).

Fred

OK. This all sounds quite overwhelming. But I decided to have a go at solving my own problem and seeing if it is of any value to anyone else. So I dove into associations.rb and ended up generating an 'order' chain in the JoinDependency when it is created. The way I did it, the order chain will only reference those associations that specifically had an 'order' clause given (as well as the primary keys of it's parents in order to maintain 'hierarchy'). I then use that chain later on in construct_finder_sql_for_association_limiting, inside a custom add_order routine I created. This routine basically first generates the standard ordering statements given in the find, and then adds association ordering on after (thereby giving precedence to the order statements from the call to find). It has served my purposes. Is there broader applicability? I am willing to share code, although it's rough around the edges since I haven't delved in here much. And it still needs to be tested. But it does seem to work for the rinky-dink multiple associations and 1-level nested associations I tried it on.

/Brad.