Pull request: fix SQL generation bug in Postgres / Oracle

https://github.com/rails/rails/pull/4082

Can I get some eyes on this? It's a relatively trivial change, but
without it Oracle and Postgres generate invalid SQL in particular
cases.

TL;DR version: the construct_limited_ids_condition function passes the
order parameters from the relation to the adapter's 'distinct' method,
but it ignores values set with 'reorder'.

More detail:
On Postgres and Oracle (perhaps others), columns used in the ORDER BY
clause must appear in the SELECT DISTINCT clause or the SQL is
invalid. In the pull request's test, this code:

author.posts_with_comments_sorted_by_comment_id.where('comments.id >
0').reorder('posts.comments_count DESC', 'posts.taggings_count
DESC').last

generates the following SQL without the patch:

SELECT DISTINCT "posts".id, comments.id AS alias_0 FROM "posts" LEFT
OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" WHERE
"posts"."author_id" = 1 AND (comments.id > 0) ORDER BY
posts.comments_count ASC, posts.taggings_count ASC LIMIT 1

Note that generated SELECT clause references the wrong order, while
the ORDER BY clause is correct.

On Oracle, the situation is even worse - there, the generated SQL
actually references the aliases from the SELECT and either silently
fails (sorting on the wrong thing) for a single column in the reorder,
or blows up with an error if the count of terms in 'order' and
'reorder' differ.

The patch fixes the issue by using whatever's been passed to reorder
to build that SQL - a pretty minor change.

On a related note, the current mechanism for handling 'reorder' seems
pretty strange; what's the design motivation behind keeping the values
passed to reorder separate from order? As a side-effect of that
implementation, this code:

SomeModel.order('foo ASC').reorder('bar DESC').order('baz ASC')

does not (as might be expected) sort the returned record by baz - the
reorder overrides all previous *and* subsequent orders.

Thanks,

--Matt Jones

https://github.com/rails/rails/pull/4082

I've commented on the PR.

On a related note, the current mechanism for handling 'reorder' seems
pretty strange; what's the design motivation behind keeping the values
passed to reorder separate from order?

The reason is so that we can handle eagerly defined scopes correctly. For example, consider:

class Post < AR::Base
   scope :by_bla, reorder('bla')
end

At the time the scope is defined, it is effectively:

class Post < AR::Base
   def self.by_bla
     Post.reorder('bla')
   end
end

If we 'collapsed' the order values at this point, when we later do, e.g.

Post.order('foo').by_bla

We are effectively doing:

Post.order('foo').merge Post.reorder('bla')
=> Post.order('foo').merge Post.scoped
=> Post.order('foo')

The 'foo' order would not be replaced because at the execution time we don't have the information to know that 'by_bla' is supposed to trigger a reordering.

The root problem is that we allow eagerly evaluated scopes to be defined at all. Personally I dislike 'scope' and would rather that we just told people to define their own class methods (which of course only get evaluated when they are actually used). Others like the brevity of the single line 'scope' syntax.

For Rails 4 I would like to discuss making use of the new lambda syntax as a compromise, e.g.

scope :by_bla, -> { reorder('bla') }

As a side-effect of that
implementation, this code:

SomeModel.order('foo ASC').reorder('bar DESC').order('baz ASC')

does not (as might be expected) sort the returned record by baz - the
reorder overrides all previous *and* subsequent orders.

I think this is a bug.

https://github.com/rails/rails/pull/4082

I've commented on the PR.

Updated PR against master: https://github.com/rails/rails/pull/4216

On a related note, the current mechanism for handling 'reorder' seems
pretty strange; what's the design motivation behind keeping the values
passed to reorder separate from order?

The reason is so that we can handle eagerly defined scopes correctly. For example, consider:

That makes sense. I was pretty sure there was a good reason. :slight_smile:

The root problem is that we allow eagerly evaluated scopes to be defined at all. Personally I dislike 'scope' and would rather that we just told people to define their own class methods (which of course only get evaluated when they are actually used). Others like the brevity of the single line 'scope' syntax.

IMO, the real issue with order/reorder here is more about encapsulation - currently, anyone who's digging in the Relation internals needs to know what to do with order_values vs. reorder_value. It doesn't actually happen everywhere - for instance, in find_one:

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/finder_methods.rb#L321

the giant conditional skips checking reorder_value. Probably not a major issue (especially since it can only matter if the identity map is on), but still annoying...

For Rails 4 I would like to discuss making use of the new lambda syntax as a compromise, e.g.

scope :by_bla, -> { reorder('bla') }

Seems sensible to me. I've also encountered a related issue with merging scopes; in short, merging a scope defined with a lambda has to *also* be wrapped in a lambda or else things don't work as intended. For instance:

scope :recent, lambda { where('created_at >= ?', 5.days.ago }
scope :recent_sorted, merge(recent).order('some_field ASC')

winds up prematurely evaluating the recent scope. Not a huge deal when the scopes are defined close to each other, but a recipe for head-scratching when they're farther apart (or in different modules, etc).

As a side-effect of that
implementation, this code:

SomeModel.order('foo ASC').reorder('bar DESC').order('baz ASC')

does not (as might be expected) sort the returned record by baz - the
reorder overrides all previous *and* subsequent orders.

I think this is a bug.

I'll write up a test and a ticket to get that discussion started.

Thanks,

--Matt Jones

Here's a pull request to correct this behavior:

https://github.com/rails/rails/pull/4282

This also solves the original problem in a slightly different way - a relation's order_values attribute is now always the ordering the final relation will use.

--Matt Jones

PS:
I also noticed a tangentially related issue - reverse_order has a similar sort of issue with late-application; for instance, this:

SomeModel.order('foo ASC').reverse_order.order('bar ASC')

will wind up with an order of 'foo DESC, bar DESC' since the reverse isn't handled until build_arel is called...