Counter gem out there?

Hi,

My application is gradually being refined and it is becoming necessary
to provide some statistics. I don't really want to add all of these
to the main application logic since they are not critical. I would
like to "observe" a bunch of models and classes and increment and
decrement counters. So rather than having a set of hideous AR finds to
show me the number of orders by month, I can observe the AR callbacks
on each event and increment "August 2011 Orders". I would also need to be
able to reset the counters via a Rake task or something similar.

Has someone come across a Gem that provides this sort of functionality
before I go and write it?

O.

Owain wrote in post #1015933:

> My application is gradually being refined and it is becoming necessary
> to provide some statistics. I don't really want to add all of these
> into the main application logic since it is not critical. I would
> like to "observe" a bunch of models and classes and increment and
> decrement counters. So rather than have a bit of hideous AR finds to
> show me the number of orders by month I can observe the AR callbacks
> on event and increment "August 2011 Orders". I would also need to be
> able to reset via a Rake task or something similar.
>
> Has someone come across a Gem that provides this sort of functionality
> before I go and write it?

Unless there was a requirement for these statistics to be absolutely
up-to-date at all times, I don't think I would track them with every
change to the models being tracked. Instead, I would probably create a
background process to periodically update the counts and store them in a
separate "statistics" table. Essentially "mining" the statistical
information rather than tracking it directly. This also provides an
opportunity to flatten some of the relational data, which can
dramatically improve query performance for reporting purposes.

Consider the example you mentioned: it seems likely that users would be
interested in "August 2011 Orders" sometime in September, or later. It
is also likely that they would want to compare August with the
results from August of the prior year.

Using data mining techniques, these statistics could be provided
retroactively, and it would also be possible to update the statistics in
a completely separate process or even on a separate server, leaving your
primary application servers free to serve user requests.
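
As a rough illustration of the "mining" approach, here is a minimal sketch
in plain Ruby. It assumes a hypothetical OrderRecord struct standing in for
rows read from the orders table and a hypothetical method name,
rebuild_monthly_order_counts; a real job would read from the database and
write the result to the separate statistics table.

```ruby
require "date"

# Hypothetical stand-in for a row from the orders table.
OrderRecord = Struct.new(:id, :created_on, :total)

# Rebuilds the monthly order counts from scratch. Because it recomputes
# everything from the source rows, it can be run retroactively and as
# often (or as rarely) as the background job is scheduled.
def rebuild_monthly_order_counts(orders)
  orders.each_with_object(Hash.new(0)) do |order, counts|
    counts[order.created_on.strftime("%Y-%m")] += 1
  end
end

orders = [
  OrderRecord.new(1, Date.new(2010, 8, 14), 40.0),
  OrderRecord.new(2, Date.new(2011, 8, 2), 25.0),
  OrderRecord.new(3, Date.new(2011, 8, 30), 90.0),
  OrderRecord.new(4, Date.new(2011, 9, 1), 15.0)
]

monthly_counts = rebuild_monthly_order_counts(orders)
puts monthly_counts["2011-08"] # August 2011 orders
puts monthly_counts["2010-08"] # same month, prior year
```

Since the counts are keyed by year and month, comparing August 2011 with
August 2010 is just two lookups.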

There are several gems that provide the infrastructure to build
something like what I described. For example, GitHub created their own
solution, Resque, which they have open sourced:

https://github.com/defunkt/resque

This would also give you the Rake task for resetting, which would
just enqueue another Resque job that you create.

+1 to what Robert suggested, *but* first I would define methods on the
models to provide the stats you are looking for (or define a new
statistics class that provides the methods), then initially just implement
them as AR queries. You have said these will be 'hideous', but they
should not be difficult to code. This may appear to be an
inefficient way to get the answers, but it will be simple and will
get you functioning (and testable) code quickly and easily. Then, in
the fullness of time, if it becomes clear that they take too long to
run, you can worry about optimising the execution using the data
mining techniques Robert has suggested. As this will only involve
refactoring the methods you started with, it will have minimal (if
any) effect on the rest of the app.
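
A minimal sketch of the "statistics class first" idea, in plain Ruby so it
runs without Rails. The Order struct, the OrderStatistics class and its
method names are all hypothetical; in a real app the method bodies would be
AR queries rather than array scans.

```ruby
require "date"

# Hypothetical stand-in for an ActiveRecord Order model.
Order = Struct.new(:created_on, :total)

class OrderStatistics
  def initialize(orders)
    @orders = orders
  end

  # Straightforward "query" version. If this ever becomes too slow, only
  # the body of this method needs to change (e.g. to read a stored count).
  def orders_in_month(year, month)
    @orders.count { |o| o.created_on.year == year && o.created_on.month == month }
  end

  # Change in order count compared with the same month of the prior year.
  def change_from_prior_year(year, month)
    orders_in_month(year, month) - orders_in_month(year - 1, month)
  end
end

report = OrderStatistics.new([
  Order.new(Date.new(2010, 8, 14), 40.0),
  Order.new(Date.new(2011, 8, 2), 25.0),
  Order.new(Date.new(2011, 8, 30), 90.0)
])
puts report.orders_in_month(2011, 8)
puts report.change_from_prior_year(2011, 8)
```

Callers only ever see orders_in_month and change_from_prior_year, so the
implementation behind them can be swapped later without touching the rest
of the app.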

Colin

> Consider the example you mentioned; It seems likely that users would be
> interested in "August 2011 Orders" sometime in September, or later. It
> would also be likely that they would want to compare August with the
> results from August of the prior year.
>
> Using data mining techniques these statistics could be provided
> retroactively, and it would also be possible to update the statistics on
> a completely separate process or even a separate server leaving your
> primary application servers free to serve user requests.

Rob,

This was my starting position, i.e. download a load of data into some
other tool (Excel) and do the analysis there. I am looking for more
real-time counters that can be used for a dashboard to graph order
volumes and so forth, so I really want it to be performant. I do not
want to do a huge amount of DB access every time the user goes to the
order statistics.

> There are several gems that provide the infrastructure to build
> something like what I described. For example Github created their own
> solution, which they have open sourced called Resque:
>
> https://github.com/defunkt/resque
>
> This would also provide you the rake task for resetting, which would
> just enqueue another Resque job that you create.

Resque is a background job queue; I currently use delayed_job for
sending emails. Unless I am mistaken, I don't think this is really
much help. I am not expecting to have to run the reset task that
often, so running it from the command line will suffice.

Colin,

That was exactly my first thought, but I think the statistics are
really just observing the application; they are not the application
itself. So rather than having a whole load of class methods on the
models to provide the data, I can use an Observer class per model, or
even better, one Observer class that collects the statistics for all of
the relevant models. That way only the statistics Observer class ever
changes as the requirements for the statistics change over time.

The idea came from the article http://railstips.org/blog/archives/2011/06/28/counters-everywhere/
by John Nunemaker (I recommend his posts for some great ideas). He
has coded this in a MongoDB environment, basically as key-value pairs.

Here is some pseudocode (not implemented):

class Amodel < ActiveRecord::Base
  # nothing here
end

class StatisticsObserver < ActiveRecord::Observer
  # Can you register more than one class per Observer,
  # or create a parent class for statistics?
  observe Amodel

  # increment_counter and update_counter are helpers still to be written.
  def after_create(obj)
    increment_counter(obj)
  end

  def after_update(obj)
    update_counter(obj)
  end
end

So the counter (i.e. the key) would simply be based upon the object
attributes concatenated with some sort of time component, such as the
month. So we could track Sunday orders above a certain value by month
and hide all of that logic in the Observer.

Then pulling the values of these counters should be easy for creating
charts, reports and so forth.
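
To make the key idea concrete, here is a minimal sketch in plain Ruby
(no Rails or MongoDB). Everything in it is hypothetical: the CounterStore
class is an in-memory stand-in for whatever key-value store ends up being
used, PlacedOrder stands in for the observed model, and the key names and
the 100 threshold are just examples.

```ruby
require "date"

# In-memory stand-in for a key-value counter store.
class CounterStore
  def initialize
    @counts = Hash.new(0)
  end

  def increment(key, by = 1)
    @counts[key] += by
  end

  def decrement(key, by = 1)
    increment(key, -by)
  end

  def value(key)
    @counts[key]
  end

  # What a reset Rake task could call.
  def reset!
    @counts.clear
  end
end

# Hypothetical order object; only the attributes the keys depend on.
PlacedOrder = Struct.new(:placed_on, :total)

# Builds every counter key an order contributes to: a name plus a month bucket.
def counter_keys_for(order)
  month = order.placed_on.strftime("%Y-%m")
  keys = ["orders:#{month}"]
  keys << "sunday_orders_over_100:#{month}" if order.placed_on.sunday? && order.total > 100
  keys
end

store = CounterStore.new
[
  PlacedOrder.new(Date.new(2011, 8, 7), 150.0), # a Sunday, over 100
  PlacedOrder.new(Date.new(2011, 8, 8), 150.0), # a Monday
  PlacedOrder.new(Date.new(2011, 8, 14), 20.0)  # a Sunday, under 100
].each do |order|
  # This loop body is what an after_create callback would do.
  counter_keys_for(order).each { |key| store.increment(key) }
end

puts store.value("orders:2011-08")
puts store.value("sunday_orders_over_100:2011-08")
```

Reading a value for a chart is then a single lookup by key, and all the
rules about which counters an order touches live in one place.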

O.