Has One and Computer Columns

Hey all,

I've been playing around with special relationships (specifying the
select, etc) on my models and trying to get Rails to return the
results of database computations as part of an association. I've got
one model representing an object and another model representing
statistics about that object on per day basis. That main Event object
has many Stats but it also has one Summary (of those Stats).
Unfortunately, Rails won't allow me to have a has one association with
computed columns (sums in my case) since it will not allow the use of
the group by clause without which Rails can't connect the returned
stats to the event since the event_id column is null. I've gotten
around this by using a has_many association for the stats even though
I'm grouping on the foreign key and will only ever have one record at
most. This creates some really "hacky-looking" code since I have to
read the first element of the returned array everywhere to read the
summary stats. I went digging through the Rails source to take a look
at why this might be prevented and I've come up with two thoughts:

1) Rails should allow use of :group for a has_one association for
specific use cases such as this one and let the programmer decide how
it works. The fix is as simple as adding it to the
valid_keys_for_has_one_association in associations.rb. However, this
lead me to a second thought...

2) In a has_one association it seems like it would never be
detrimental to use a group by clause on the foreign key as this would
always return one or fewer results for the parent table. This relates
to the comment at association_preload.rb:148 (in 2.3.3) which states
that there is no way to ask the database for one row per distinct
foo_id (which is what group by would do, at least in MySQL).

Am I missing something or is this something that ought to be patched?
Thanks for the input and help!

tony

Ooops, just realized my subject said "Computer" and not "Computed". I
guess no one else has run into this or seen it as a problem?

tony

Hey all,

I've been playing around with special relationships (specifying the
select, etc) on my models and trying to get Rails to return the
results of database computations as part of an association. I've got
one model representing an object and another model representing
statistics about that object on per day basis. That main Event object
has many Stats but it also has one Summary (of those Stats).
Unfortunately, Rails won't allow me to have a has one association with
computed columns (sums in my case) since it will not allow the use of
the group by clause without which Rails can't connect the returned
stats to the event since the event_id column is null. I've gotten
around this by using a has_many association for the stats even though
I'm grouping on the foreign key and will only ever have one record at
most. This creates some really "hacky-looking" code since I have to
read the first element of the returned array everywhere to read the
summary stats. I went digging through the Rails source to take a look
at why this might be prevented and I've come up with two thoughts:

Can you be a bit more specific on what associations you want to setup
on your models. Is Summary a model for example, as well as Event even
though they access the same table?

Colin

The "Summary" object you've described doesn't sound like a proper
ActiveRecord object - is it actually persisted to the database? It
sounds like you'd be better off just defining a non-AR model (maybe an
OpenStruct) for the summary. It's not going to help with preloading,
but you didn't state if that was an issue.

--Matt Jones

So I'd like to set up the below associations. An Event is a normal AR
model. Stat is a normal AR model. I want a special stat on Event
that is a DB computation (mostly just sums of each column) and a
summary of all the Stats for that Event. The Summary isn't persisted
and is marked as readonly. Unfortunately, preloading is important in
this case (and thus why I went off on the group_by tangent, sorry if
that wasn't clear).

I guess I could make a new model for Summary referencing the stats
table which uses a special select and always adds the group by clause,
however, that seemed like overkill. I really didn't want to run a
loop over the returned stats since the DB can give me exactly what I'm
looking for on the initial query.

Ok, here's a slightly abridged version of what I thought might work:

Event:
has_many :stats
has_one :summary_stat, :select => "sum(people) as
people, ...", :class_name => 'Stat', :foreign_key =>
'event_id', :readonly => true

Stat:
belongs_to :event

Adding "group" to the has_one solves the issue, but am I going about
this in the "right" way? Thanks for the input!

tony

I don't think I would go about it this way. Using Event has_one
summary_stat suggests that somewhere there is a model SummaryStat that
belongs_to event.
An alternative would be to define an appropriate named scope in Stat
that provides the summed stats using the single query and provide a
read only attribute of Event that calls it and provides the answer.

Colin

Thanks for the quick response! I'm not familiar with the named
scopes. Will these work with preloading the records (I assume by
using (:include => [:stats])? I could end up with a ton of Events on
one page and I'd prefer not to have to fetch the summary for every
one. I also assume I'd reference it by doing event.stats.summary?
That seems a lot cleaner. I'll go dig into this a bit. Thanks!

tony

Maybe I'm missing something but I don't see any way to get eager
loading with the named scopes. I have many Event models each with
many Stat models. If I have to query each event to get the summary, I
still have the n+1 query problem. Maybe the best solution is to
define a second model referencing the same table but with a different
default scope? That doesn't seem like it is the most concise solution
but it may be the cleanest and easiest to read.

I think the ideal solution would be the ability to reference a named
scope of a joined model. E.g:

Event:
has_one :summary, :class_name => 'Stat', :scope => :summary

Does that make sense? Is there some other way that I'm overlooking?
Thanks again!

tony

Thanks for the quick response! I'm not familiar with the named
scopes. Will these work with preloading the records (I assume by
using (:include => [:stats])? I could end up with a ton of Events on
one page and I'd prefer not to have to fetch the summary for every
one. I also assume I'd reference it by doing event.stats.summary?
That seems a lot cleaner. I'll go dig into this a bit. Thanks!

If you have a method of Event that fetches that particular event's
summary (by calling into Stat to perform the summing query) then it
would be event.summary, which seeems intuitively correct to me. I am
not sure what you mean by preloading, Are you trying to fetch all the
summaries for all events in one query? The approach I suggest should
do one query for each summary when you need that summary.

By the way the convention in this list is not to top post, so that
threads are more easily followed.

Colin

Maybe I'm missing something but I don't see any way to get eager
loading with the named scopes. I have many Event models each with
many Stat models. If I have to query each event to get the summary, I
still have the n+1 query problem.

Don't follow you here, it should be just one query for each event to
get the stat summary for that event.

Colin

Ooops, sorry about that. So it is indeed one query per event,
however, I need a list of N events all at the same time. I didn't
mean to imply that it is N queries per event, rather that it is N
queries per page load. The page is loading N events and their
summaries (kinda like a dashboard...). This means that I'd need a
query for every Event to fetch the summary, thus the N queries, + 1
for the initial fetching of the events. If I were to stuff this in an
association I'd have 2 queries only (one for the events and one for
all of the summaries). If I were only displaying a few Events per
page I wouldn't be concerned, however, 10, 20, 30+ queries for a
single page would begin to impact performance, I imagine. I'd really
like to replicate the performance of a find using :include.

If the only way to get that performance is to use a separate model
(SummaryStat) and mark it as readonly, I'm happy to do it. I was
simply hoping for a more concise and readable way of doing this. I
got very close to being able to do it in a concise way using the
normal has_* relationships, however, has_one doesn't support grouping
which I needed for this particular query.

Thanks again for bearing with me on this. I appreciate all your help.

tony

I think you are worrying too much about performance at this stage. Do
it whichever way seems most natural to you initially and worry about
performance later. My experience is that when performance problems
arise it is almost never the area that I expected to be the bottleneck
that is causing the problem and so to spend time worrying about it too
much at the start is a waste of time.
In fact the approach I suggest does not preclude using a single query
to get all the summaries for a set of events. Just make the class
method of Stat that I am suggesting takes a single event take an array
of or events instead, run the single query, and return an array of
stats.
As I said though I would suggest doing it the easy way first and worry
about additional complications if performance becomes an issue.
So I would still suggest
A class method of Stat, get_summary_for_event( event ) that runs the
summing query and returns the summary
An instance method of Event, summary that calls
Stat.get_summary_for_event( self ) and returns it.
It seems difficult to get a more concise and readable way of doing it than that.
However, as I said, I would suggest doing it whichever way seems most
natural to you, which is the 'best' way of doing things is often more
about how ones brain sees the problem than about some supposedly ideal
solution.

Colin

Colin,

Thanks again for sticking with me on this. I actually already have
this working and am currently cleaning up the code. Your point still
stands however, and I probably shouldn't worry about performance too
much. I just didn't want to degrade the performance at all from where
it stands currently. I'll play with your suggestion. Thanks for the
help!

As a sidenote: I think the ideal solution would be the ability to
specify scope (named or default) with eager loading. There's an open
ticket currently regarding this:
https://rails.lighthouseapp.com/projects/8994/tickets/2348-eager-loading-does-not-respect-default_scope.
Thanks again.

tony