Encapsulate Relation#calculate into a subquery

I started rails issue #7121, but I think this needs some discussion as it’s not just a simple “bug”. I might take a stab at it myself, but I’d like to get a blessing or consensus on this:

Given a relation:

  1. When .group is added for conditions, this forces .count into returning a grouped count instead of a flat numeric count.

  2. When .select(“virtual_col”) and .order(“virtual_col”) are used, .count fails with a SQL error because the .select is rewritten for COUNT(*) but the .order is still there pointing to a non-existent column.

I found a solution for these by wrapping the relation in a subquery. This isolates the relation so calculate doesn’t see its group_values and select_values and can’t mess them up:

def self.wrapped

inner_scope = self.all # 3.2: use self.scoped

self.unscoped do

from(inner_scope.as(table_name))

end

end

Currently I use this by manually wrapping my relation before before calling .count on it: relation.wrapped.count

Is there any reason why Relation#calculate shouldn’t work this way by default? For grouped counts, here’s what I suggest:

  1. count() on a relation that uses group_values should just return a numeric count of records, not a hash.

  2. grouped counts must be done explicitly: .count(group: “column”) or .count_by(“column”) should return a hash.

Currently, finder methods on calculate are deprecated so using .count(group: “column”) logs a warning, but with the wrapped method it all works.

How to proceed?

For consistency, relation.count, relation.group(:assoc).count, and relation.to_a.count should all return Fixnums…

In #7121, I opened the discussion about separating count into a count_by(:column) method.

Currently there’s a deprecation on count(:group => :column), which advises to chain a group relation before calling count. I think this is the wrong approach. I suggest either:

  1. Deprecate count(:group => :col) in favour of a new count_by(:col)
  2. Accept count(group: col) as the only way to get grouped counts.

With the change I’m proposing, either option means any current relation.group(:col).count would change from returning a Hash to returning a Fixnum.

I’d like to look into fixing this when I have some time soon, If you have a preference for either of the above directions, let me know.

Andrew Vit

I’d like to get some traction or feedback on issue #7121 so I know if it’s worth me doing it. What are the objections to:

  1. Fixing brittle calculation methods via subqueries?
  2. Requiring count(:group=>:author_id) option explicitly when we want a Hash instead of Fixnum?

Or am I the only one who thinks this is broken? :frowning:

Andrew Vit

I posted my proposed changes to the Calculations API, and resulting SQL queries here:

https://gist.github.com/3681183#file_calculations_b.rb

I’m preferring the “calculations_b.rb” version over the one with a separate grouped_count method: it would be a less intrusive change, and when we need to pass a :group option anyway, it doesn’t make sense to repeat grouped_count(group:...) when count(group:...) is obvious.

Andrew Vit

There have been times when I would have liked to be able to somehow consistently get a fixnum count, instead of sometimes a fixnum and sometimes a hash that I just have to loop through and add up the values!

Dave

Did this ever get patched? I’m running into this issue currently and would love to know if it got merged anywhere.

No, I couldn’t really get any traction on this so I let it slide. The consensus on the Github issue was to leave the count inconsistency as-is. The last word from jonleighton was:

@avit coming back to this, I am feeling less confident that anything needs to be changed in AR.

We already have a way to create subquery via #from and you can then perform a count on the subquery if you wish. This to me seems sufficient to handle what is a rare case in most applications.

So if I understand correctly, the remaining objection is that #count might return a Fixnum or a Hash depending on the situation. I probably wouldn’t have designed it like that in the first place, but given it’s there I’m not sure there is a strong argument for changing it…

I still disagree and would’ve loved to work on a solution before Rails 4, but that’s where it’s at.

Andrew Vit