How to count a group of data

11155 · November 9, 2010, 12:26pm

Hello everyone,

Let's say I have the following set of data in my database table:

Cars table

murali_dhararao · November 9, 2010, 12:34pm

Hi,

Why the data is redundant, ids there any relation behind that.

Mercedes CL63 why in two times is there any factor to distinguish in between than type and Brand?

11155 · November 9, 2010, 1:37pm

John N. wrote in post #960298:

Hello everyone,

Let's say I have the following set of data in my database table:

Cars table

------------------------------------- [Brand] [Type] Mercedes CL63 BMW M3 Mercedes CL63 Audi RS4 BMW M3 BMW M3 -------------------------------------

I would like to know how many cartypes I have of a particular brand. So in this case I would expect an output something like:

2x Mercedes CL63 3x BMW M3 1x Audi RS4

How can I generate such an output? I have really no clue how to start.

You would do this by using SQL aggregate functions (in this case, COUNT(*) ). If you don't know about those, then your friendly local SQL reference is your first stop.

Once you know about aggregate functions, look at the ActiveRecord::Calculations module (at least in Rails 2; it might have got moved in Rails 3), which abstracts these functions and provides a nice AR interface.

Suggestions, tips that can pull me in the right direction are welcome.

My usual tip: learn SQL. You may not have to write much of it when using Rails, but you need to know how SQL databases basically work. AR exists to automate and abstract, not to prevent you from having to learn SQL.

Thanks a bunch.

John

Best,

11155 · November 9, 2010, 2:48pm

Murali Tirupati wrote in post #960320:

Hi,

Why the data is redundant, ids there any relation behind that.

Mercedes CL63 why in two times is there any factor to distinguish in between than type and Brand?

Good catch. I was too focused on the proximate question.

More advice for the OP, then: your DB contains repetition. That's virtually always a bad thing. The process of removing this repetition is called *normalization*. You should always (except for a few special cases) normalize your schema to Third Normal Form (3NF) at a bare minimum, and usually to 5NF.

Yes, I know I'm throwing around jargon. Read about DB normalization (the Wikipedia articles are excellent) and apply it religiously.

Best,

Michael_Pavling · November 9, 2010, 6:26pm

Mercedes CL63 why in two times is there any factor to distinguish in between than type and Brand?

Good catch. I was too focused on the proximate question.

I'd assume (and happy to be proved wrong if the OP wants to confirm) that the "cars" table has other fields that identify different instances of physical cars (say, for a vehicle dealership tracking its sales), and there are fields other than *just* make and model in the row (such as colour, engine size, fuel, etc)

How can I generate such an output? I have really no clue how to start.

You would do this by using SQL aggregate functions (in this case, COUNT(*) ). If you don't know about those, then your friendly local SQL reference is your first stop.

In addition to using the SQL functions, Ruby and Rails both include functions for grouping (and sort, if you wish) enumerable objects. Have a look at: http://api.rubyonrails.org/classes/Enumerable.html#method-i-group_by http://ruby-doc.org/core/classes/Enumerable.html

The choice as to whether to do the same thing in code or in the DB is down to you, and your evaluation of the efficiency/benefit one may give you over the other (personal preference comes into it a lot too!

11155 · November 9, 2010, 6:46pm

Michael Pavling wrote in post #960375: [...]

How can I generate such an output? I have really no clue how to start.

You would do this by using SQL aggregate functions (in this case, COUNT(*) ). If you don't know about those, then your friendly local SQL reference is your first stop.

In addition to using the SQL functions, Ruby and Rails both include functions for grouping (and sort, if you wish) enumerable objects. Have a look at: Enumerable module Enumerable - RDoc Documentation

The choice as to whether to do the same thing in code or in the DB is down to you, and your evaluation of the efficiency/benefit one may give you over the other (personal preference comes into it a lot too!

However, here's a rule of thumb: *never* use Ruby for things like a simple count of DB records. Let the DB do the things that it's good at. That includes mass operations on large amounts of data. Complex calculations and application programming are generally better done in the app layer, of course.

Doing a count of DB records on the Ruby side? I wouldn't dignify that with the term "personal preference".

Best,

Michael_Pavling · November 9, 2010, 7:04pm

As a thought-experiment, how would you handle several counts in a row on a rapidly changing table for some form of end-user report? If I do one SQL query for a count by one set of criteria, and then do another query for a similar query, but just grouped differently, I may get different numbers because the underlying records in the table have changed.

If I do the same thing in code; I do one DB query, and then use the Ruby/Rails methods to manipulate that data how I need, and I can always be assured that I am at least *always* operating on the set same records for that request.

This is the kind of situation that I have a "personal preference" to do in code, not in the DB (although the DB may be better *at* it, there's a chance my numbers won't be the same which will confuse users).

11155 · November 9, 2010, 7:14pm

Michael Pavling wrote in post #960385:

However, here's a rule of thumb: *never* use Ruby for things like a simple count of DB records. Let the DB do the things that it's good at.

Doing a count of DB records on the Ruby side? I wouldn't dignify that with the term "personal preference".

As a thought-experiment, how would you handle several counts in a row on a rapidly changing table for some form of end-user report? If I do one SQL query for a count by one set of criteria, and then do another query for a similar query, but just grouped differently, I may get different numbers because the underlying records in the table have changed.

That's what transactions are for -- guaranteeing atomicity and consistent state over multiple queries. This is a perfect use case for them:

User.transaction do mikes = User.count :conditions => {:name => 'michael'} active = User.count :conditions => {:active => true} end

If I do the same thing in code; I do one DB query, and then use the Ruby/Rails methods to manipulate that data how I need, and I can always be assured that I am at least *always* operating on the set same records for that request.

But you're doing far too big a query. You're asking for (let's say) thousands of records, which Ruby will then parse in memory.

If you do the count on the DB side, then the DB will be able to use whatever indices and performance hacks are available to it to do the counting more quickly -- and yes, more maintainably -- then your Ruby application could.

This is the kind of situation that I have a "personal preference" to do in code, not in the DB (although the DB may be better *at* it, there's a chance my numbers won't be the same which will confuse users).

Then your preference is, I think, poorly founded, as it does not take the facts of the situation fully into account, and does not take full advantage of the power of the DB.

Do you disagree with my reasoning? If so, why?

Best,

Michael_Pavling · November 9, 2010, 7:47pm

Not at all... quite the opposite; I've just slapped myself on the forehead. I can't believe it's never occurred to me to use transactions for data-retrieval queries (use them for updating and inserting all the time)

Regards,

pepe1 · November 10, 2010, 1:48pm

+1

Colin_Law1 · November 10, 2010, 2:09pm

+1

Does that mean that you agree that Michael should slap himself on the forehead? Or possibly if the reader is Michael.

Colin

pepe1 · November 15, 2010, 4:13pm

> +1

Does that mean that you agree that Michael should slap himself on the forehead? Or possibly if the reader is Michael.

Colin

First option. It also increases the count of slapped foreheads.

I haven't tested what Marnen suggested yet, though. I'm sure it will work but I just have to see for myself.

Topic		Replies	Views
How to count a group of data rubyonrails-talk	0	173	November 9, 2010
Activerecord.count calculation not reurning expected output rubyonrails-talk	0	146	May 8, 2009
Count by group optimization. rubyonrails-talk	0	78	July 19, 2007
Retrieving count of individual items among the table rubyonrails-talk	9	407	June 7, 2009
ActiveRecord count with grouped relations rubyonrails-core	0	228	October 18, 2013

How to count a group of data

Related topics

More Resources