Possible bug with chained where methods on 3.0.4

Colin_Law1 · February 16, 2011, 9:52am

HI

I notice on Rails 3.0.4 that the following two statements generate different results.

State.where(:abbreviation => 'TX').where(:abbreviation => 'NE') SQL: SELECT `states`.* FROM `states` WHERE (`states`.`abbreviation` = 'TX') OR (`states`.`abbreviation` = 'NE')

State.where(:abbreviation => 'TX').where("abbreviation = 'NE'") SQL: SELECT `states`.* FROM `states` WHERE (`states`.`abbreviation` = 'TX') AND (`states`.`abbreviation` = 'NE')

Using Rails 3.0.3 they both generated AND clauses which seems to me to be correct.

Is this a bug or by design please?

Colin

Ernie_Miller2 · February 16, 2011, 11:21am

The OR functionality is new, but intentional. See Relation::QueryMethods#collapse_wheres. Since the idea of having a column being equal to two different values doesn't make sense, equalities with the same left-hand side get combined with OR, which is more likely the desired result if two scopes are merged and both include equality conditions on the same column.

Since your example above used a string in the second where call, it didn't result in an Arel::Nodes::Equality, so it wasn't ORed.

Robert_Pankowecki · February 16, 2011, 11:35am

I doubt that. There are a lot of cases (especially when using search/filter functionality of application) where you want it to generate an empty set because there are no matching records. Everybody everywhere in current code assumes that the condition will be added with 'AND' operator and now we are breaking this assumption ?

If I want to have 'OR' then I would build the query this way:

State.where(:abbreviation => ['TX', 'NE']) or using longer but perfectly fine and valid arel version...

An example:

def where_something(scope, value) scope.where(:column => value) end

This code is ambiguous now. I do not know what sql it is going to generate. Will it be "AND column = value" or "(... OR column = value)" ?

Please revert it to the original behavior and provide a different way of achieving this like:

State.possible(:abbreviation => 'TX').possible(:abbreviation => 'NE') => sql: "states.abbrevation = 'TX' OR states.abbrevation = 'NE"

or whatever different method name that reveals the intentions clearly

Robert Pankowecki

Will_Bryant · February 16, 2011, 11:37am

Even on just that point alone, this is a really bad API idea - it is completely inconsistent, and this will cause subtle and confusing bugs when people change between two otherwise-identical forms of where() argument.

And to do it at all breaks the relational algebra idea badly. If I pass a scope to something, and it does several different queries on it using its own where scopes, each of those scopes should further restrict the original scope, not broaden it.

With this commit, now neither callers nor callees have any idea what where(:hash => arguments) will result in - whether the returned results will match the conditions hash - unless they have constructed the entire scope from scratch.

I also struggle to understand how it was decided that a minor point patch release - for a security issue no less - should include an API behavior change that will cause preexisting code to start getting completely different query results.

Colin_Law1 · February 16, 2011, 11:39am

OK, thanks for that. If that is the way it is supposed to be then that is the way it is supposed to be. It seems odd to me that chaining a second where *adds* to the dataset of the first where. It would seem particularly unexpected in the case items = Model.where( :x => y) ... ... myitems = items.where(:x => z) resulting in myitems having more records than items, and in particular records that do not satisfy :x => z. Note that at this point the code would not necessarily 'know' the details of the first scope.

Colin

Ernie_Miller2 · February 16, 2011, 12:24pm

Sound arguments. I didn't implement collapse_wheres, just in case a witch hunt starts. I was just the first person awake this morning to pose the answer.

The change probably shouldn't have made it into a point release, but I do find a certain convenience and logic to it, and I long for a day when we aren't hacking about trying to make a String and a Hash act like they're the same thing. Hashes aren't strings, and (IMHO, and I know there are people on the core team who disagree) it would be better if we gradually eliminated hand coded SQL string conditions and relied more heavily on hashes and other data structures, converting them to SQL when needed via ARel. They're already a last resort in my own code and I expect them to be relatively brittle when I use them.

Ernie_Miller2 · February 16, 2011, 12:40pm

I should also probably note that I went off on a sort of anti-String tangent there, unrelated to the topic at hand. So please read:

"...a certain convenience and logic to it, and I long for..."

as:

"...a certain convenience and logic to it. On another note, I long for..."

Thanks.

Brian_Morearty · February 16, 2011, 1:35pm

Sound arguments. I didn't implement collapse_wheres, just in case a witch hunt starts. I was just the first person awake this morning to pose the answer.

Understood.

With this commit, now neither callers nor callees have any idea what where(:hash => arguments) will result in - whether the returned results will match the conditions hash - unless they have constructed the entire scope from scratch.

I agree with this reasoning, too. Under the 3.0.4 logic, we won't be able to chain independent scopes together on the fly with any confidence. E.g., a loop that iterates over various input conditions and adds scopes one by one based on the input.

+1 in favor of reverting to the original behavior: chaining 'where' clauses should consistently use 'AND'.

Brian

Chris2 · February 16, 2011, 1:40pm

+1 for the the original behavior where chained clauses are ANDed (POLS).

Aaron_Patterson1 · February 16, 2011, 4:59pm

The intentional change fixes this bug:

#4598 default_scope treats hashes and relations inconsistently when overwriting - Ruby on Rails - rails

If there is a different way to fix that ticket, I'm happy to apply patches.

Robert_Pankowecki · February 16, 2011, 5:10pm

The different way is to make both cases return an empty array. David (who created the ticket) expects something different but I am not sure that this is what most people expect.

Could we provide a different method for such behavior as expected by David ?

Robert Pankowecki

JangoSteve1 · February 16, 2011, 5:23pm

The intentional change fixes this bug:

https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4598-default_scope-treats-hashes-and-relations-inconsistently-when-overwriting

If there is a different way to fix that ticket, I’m happy to apply patches.

The different way is to make both cases return an empty array. David (who created the ticket) expects something different but I am not sure that this is what most people expect.

Could we provide a different method for such behavior as expected by David ?

Agreed, the OR behavior seems inconsistent and unintuitive. In fact, the original solution suggested was to AND them together to make case 1 consistent with case 2 (rather than ORing them to make case 2 consistent with case 1).

https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4598-default_scope-treats-hashes-and-relations-inconsistently-when-overwriting#ticket-4598-2

That seems more sane and less radical than having a single special case where chained methods get ORd. Especially when Arel itself already has its very own #or method to accomplish this within the relation chaining.

It seems odd to me that chaining a second where adds to the dataset of the first where.

Exactly.

-Steve @jangosteve

James_B_Byrne · February 16, 2011, 5:52pm

Perhaps:

#where_in == OR

#where == AND

???

Jason_King · February 16, 2011, 6:22pm

Just want to get this onto the table, “what most people expect” is not the only measure of a good API choice.

To specifically answer Aaron’s question, for #4598, I think the fix for that would be for subsequent default_scopes to just overwrite previous ones entirely. Not to merge with them.

For the broader question which seems to be what this thread is about, ie. what to do with: User.where(:foo => 1).where(:foo => 2) how about adding alias and as an alias for where, and adding or as an alias for “where that does ORs”. So the syntax could be:

User.where(:foo => 1).and(:foo => 2)

And:

User.where(:foo => 1).or(:foo => 2)

I’m sure I don’t need to state what SQL each of those turn into (the real test for a good API

Jason_King · February 16, 2011, 6:27pm

Btw, that could also resolve the "how to combine default_scopes issue, if that’s something that is still desired.

default_scope where( :foo => 1 )

And then:

default_scope where( :foo => 2 ) # effectively negates previous default scope

default_scope and( :foo => 2 ) # same as above

default_scope or( :foo => 2 ) # merges with original

All nice and explicit.

AND:

User.or( :foo => 2 )

…would also combine with any existing default_scope, whereas:

User.where( :foo => 2 )

Would AND it and therefore negate the default scope.

Ernie_Miller2 · February 16, 2011, 6:29pm

Pretty sure the reason default scopes are cumulative is to support STI classes refining their parent classes default scope, but I could be wrong. Outside of that context, I can’t think of a use case for calling it multiple times, off the top of my head. Happy to be corrected, though.

Matt_Jones · February 16, 2011, 6:59pm

I suspect the 'and' behavior is already pretty well understood as the result of chaining 'where's. What about an explicit one for the 'or' case:

User.where(:foo => 1).or_where(:foo => 2)

The only issue I could see with this syntax is that it's somewhat unclear what should happen here:

User.where(:foo => 1).where(:bar => 2).or_where(:bar => 3)

as it could be interpreted as either

Version A: (foo = 1 AND bar = 2) OR (bar = 2)

or

Version B: (foo = 1) AND (bar = 1 OR bar = 2)

The issue is especially tricky once you're constructing relations in more than one place. For instance, this should do something reasonable:

r = User.where(:foo => 1) def foo(some_relation) some_relation.where(:bar => 1).or_where(:bar => 2) # probably expects version B SQL end foo(r) # probably expects version B SQL

but so should this: r2 = User.where(:foo => 1).where(:bar => 1) def other_foo(some_relation) some_relation.or_where(:bar => 2) end other_foo(r2) # probably expects version A SQL

the difficulty is that the sequence of method calls on the relation are identical, they're just in different scopes.

On the other hand, isn't this sort of issue exactly why Arel has operators to combine scopes? The chained notation is alright for some things, but it's never going to be sufficient to create arbitrary SQL without a lot of fussing.

--Matt Jones

Jason_King · February 16, 2011, 8:27pm

Perhaps or() and where/and (I mainly add and because it’s symmetrical with or and adds a lot of readability IMHO) could take other where() snippets? So your examples become:

r = User.where( :foo => 1 )

r = r.and( r.where( :bar => 1 ).or( :bar => 2 ))

…and…

r = User.where( :foo => 1 ).where( :bar => 1 )

r = r.or( :bar => 2 )

…respectively.

Jason_King · February 16, 2011, 8:50pm

I get the feeling that those in the know are just hanging back waiting for us to tie ourselves in syntactic knots.

I don’t want to derail the original thread here, which is dealing with the much more common and simple case of providing a consistent API for when to add an AND and when to combine same-name parameters.

If we postpone any thoughts of creating complex nested WHERE clauses, and in effect (probably using the IN function) limit our parenthesizing to same-name columns, then I think there’s still a lot to be gained by making the OR explicit.

Then for non-same-name columns, we just leave it up the programmer and the call order (ie. no other parenthesis).

Jeremy_Evans · February 16, 2011, 9:16pm

Having where behave differently depending on whether the column was already used elsewhere in the relation is pure madness. If you do:

relation.where(:column => 1).all

All records returned by it should have 1 as the value of column, regardless of the contents of relation. Using where should filter the relation, it should always return a subset of the receiver.

FWIW, in Sequel, #where always operates as a filter (uses AND), and you can call #or if you want to use OR.

Jeremy

Topic		Replies	Views
3.0.4 change to chained where methods. Documentation? rubyonrails-talk	6	138	February 27, 2011
SQL OR in Rails rubyonrails-talk	6	107	October 21, 2008
Scopes with OR and AND optional chainity discussion rubyonrails-core	7	189	June 22, 2012
[ActiveRecord] `where.or` query method rubyonrails-core feature	6	3099	October 16, 2019
Access to more Arel predicate types from where condition hash rubyonrails-core patch	2	338	April 18, 2010

Possible bug with chained where methods on 3.0.4

Related topics

More Resources