Negative find : trying to find what is NOT there :-)

Hi,

Sorry about the confusing title. I'm feeling a bit frustrated about Ruby now..

Ok, I have a membership database appication. For one view, I would
like to show all members who has NOT payed for the current term :slight_smile:

So, with the models member and fee_payment, (one member has many
fee_payments) I try to get the exklusion by finding all members,
finding all fee_payments for the current term, and exclude the members
associated with a fee_payment in the fee_payments list from the list
of members => the members who has not payed.

So, this is what I tried:

    @maybe_active_members = @club.members.find(:all,:conditions =>
["is_active = ?",true ], :order => "surname" )

    @term = "VT2008" #Hard-coded term

   #Find all mayments from the current term

    @fps = FeePayment.find(:all, :conditions => ["term = ? AND club_id
= ?", @term , @club])
    #Get an array of the member associated with the fee_payment
    @fpm = @fps.each {|f | f.member }

   #Exlude members who has payed from the list
    @active_members = @maybe_active_members.reject {|m | @fpm.include?(m) }

Arrgh! What am I doing wrong? Is there a simpler way?

NOT very good with this (or any other) stuff..

/Fredrik

Sorry about the confusing title. I'm feeling a bit frustrated about Ruby now..

Ok, I have a membership database appication. For one view, I would
like to show all members who has NOT payed for the current term :slight_smile:

So, with the models member and fee_payment, (one member has many
fee_payments) I try to get the exklusion by finding all members,
finding all fee_payments for the current term, and exclude the members
associated with a fee_payment in the fee_payments list from the list
of members => the members who has not payed.

So, this is what I tried:

   @maybe_active_members = @club.members.find(:all,:conditions =>
["is_active = ?",true ], :order => "surname" )

   @term = "VT2008" #Hard-coded term

  #Find all mayments from the current term

   @fps = FeePayment.find(:all, :conditions => ["term = ? AND club_id
= ?", @term , @club])
   #Get an array of the member associated with the fee_payment
   @fpm = @fps.each {|f | f.member }

  #Exlude members who has payed from the list
   @active_members = @maybe_active_members.reject {|m | @fpm.include?(m) }

Arrgh! What am I doing wrong? Is there a simpler way?

NOT very good with this (or any other) stuff..

You want an "outer join" query. http://en.wikipedia.org/wiki/Join_(SQL)#Outer_joins

You want to return all the members and their payment info for that term. If no payment, return blank data. Then filter on that.

It's doable with AR by using :include and :conditions appropriately.

-philip

Hi,

Thank you for your quick relply. Ok, I need an OUTER JOIN.. but how do
I do one in rails? If I do:

@out = @club.members.find(:all,:conditions => [" term = ?" ,"VT2008"
], :include => :fee_payments)

I get the number of FeePayments correctly.. however, If I do

@out = @club.members.find(:all,:conditions => [" term != ?" ,"VT2008"
], :include => :fee_payments)

I get only 1 element (where there should be > 200 of them. Hmm.. how come?

/Fredrik

Thank you for your quick relply. Ok, I need an OUTER JOIN.. but how do
I do one in rails? If I do:

@out = @club.members.find(:all,:conditions => [" term = ?" ,"VT2008"
], :include => :fee_payments)

You want this one and then filter the result only looking at cases where
the fee_payment is nil.

What's the SQL in your dev log for this?

Hi!

Thanks for all your help. This became my solution to the problem
(could not get outer joins to work):

    @maybe_active_members = @club.members.find(:all,:conditions =>
["is_active = ?",true], :order => "surname" )

    @has_payed = @club.members.find(:all,:conditions => ["is_active =
? AND term = ?",true, @term ], :include => "fee_payments", :order =>
"surname" )

    @active_members = @maybe_active_members.reject {|m |
@has_payed.include?(m) }

Seems to work (unless anyone sees a problem with the code..)

The SQL I got was

Member Load Including Associations (0.009988) SELECT members."id" AS
t0_r0, members."given_name" AS t0_r1, members."surname" AS t0_r2,
members."date_of_birth" AS t0_r3, members."gender" AS t0_r4,
members."primary_adress" AS t0_r5, members."primary_zip" AS t0_r6,
members."primary_city" AS t0_r7, members."primary_email" AS t0_r8,
members."primary_telephone" AS t0_r9, members."primary_mobile" AS
t0_r10, members."secondary_adress" AS t0_r11, members."secondary_zip"
AS t0_r12, members."secondary_city" AS t0_r13,
members."secondary_email" AS t0_r14, members."secondary_telephone" AS
t0_r15, members."secondary_mobile" AS t0_r16, members."is_active" AS
t0_r17, members."club_id" AS t0_r18, members."created_at" AS t0_r19,
members."updated_at" AS t0_r20, fee_payments."id" AS t1_r0,
fee_payments."member_id" AS t1_r1, fee_payments."club_id" AS t1_r2,
fee_payments."term" AS t1_r3, fee_payments."created_at" AS t1_r4,
fee_payments."updated_at" AS t1_r5 FROM members LEFT OUTER JOIN
fee_payments ON fee_payments.member_id = members.id WHERE
(members.club_id = 1 AND (is_active = 't' AND term = 'VT2008')) ORDER
BY surname

/Fredrik

you should try to avoid hitting ruby to strip out the unnecessary
records. Your database is streamlined to execute these types of
queries, so try to use it before falling back on code.

In one of my apps, I've got users and orders. To display all the
users who have not placed an order, I can use something like the
following:

select id from users where not exists (select * from orders where
user_id = users.id);

or

select id from users where id not in (select user_id from orders);

You'll have to use find_by_sql to use the above queries, but like I
said, it's much faster than using code to iterate and reject your
returned rows. Especially since if you look at your code, you're
iterating over each element of @maybe_active_members (N elements in
total), and for each of those elements, you're scanning through
another list of N elements, in the array @has_payed. So you're
executing a O(n^2) algorithm every time this code is run, which is
bad, not to mention that you're running two sql queries before hand.

You may hear others say that premature optimization is wrong, and I
agree, but in this case, I think it's important to understand the
difference between tuning your query using sql (preferred) or just
using brute force through ruby (last resort). So even if you don't
make this change now, at least you'll know that in the future when you
have hundreds of thousands of users, that this will potentially be a
bottleneck.

Adam

Alternatively, using an ANSI style join:

select users.id
from users
left join orders on orders.user_id = users.id
where orders.user_id is null

Additionally, you could abstract your hand crafted query to RoR through a database view or such. That way RoR doesn't need to know about your SQL and your SQL doesn't need to be embedded in RoR.

The best of both world really :slight_smile:

Hi,

I agree. I would like to reduce the DB access to one and perhaps to it
in SQL instead. However, right now, I don't know what kind of DB I
will be running this on. Also, for my application, the n woud
reastivally be about 100. Did some simulations with generated members,
and sorting 449 from 450 does not seem to be a problem. Even in sqlite
on a tiny Macbook :slight_smile:
So, I wil make a note of the possibility of focusing on this point
when optimization is needed.. but continuing on to complete the app
instead.
:slight_smile: Database independence seems to be more important right now.

But, Thanks ! I just love the quality of feedback I get on this list.

/Fredrik

Out of curiosity, does your project contains any database specific DDL or DML? Perhaps some create table, index, partition, constrain, sequence, analytic, statistic, or such?

Or is it fully driven end-to-end by RoR's ORM? E.g. all the database artifacts, as well as the database itself is generated, managed by RoR, from cradle-to-grave?

If your project already have database specific artifacts, it is but a small step to re-factor some of the more advance DML away from RoR and expose them back to RoR in an ORM friendly way such as a view or cursor or whatnot.

After all, what's the point of an ORM but to shield you away from common SQL trivialities?

On the other hand, if an ORM starts mimicking most of SQL peculiarities, perhaps something has gone terribly wrong along the way :slight_smile:

Hi,

Acctualy, I have been able to not use any database specific structures
(so far, knock on wood) and only use the fascilities that RoR ORM and
migrations provides me with. I have donw simulations using 1.5 as many
simutaneous objects as I could possible think would be used at the
same time using this app, and there is a slight lag. However, since
these simlations then involve a data set that is 8 times the size of
the data set that would currently be handled :slight_smile: I will, I think, opt
out of any early database specific optimizations for the time beeing.

For instance, adding storage in the parent model would easilly bring
it back to O(n) rather than O(n*m), but make the app less beautiful
:slight_smile:

/Fredrik

Fredrik Karlsson wrote:

Ok, I have a membership database appication. For one view, I would
like to show all members who has NOT payed for the current term :slight_smile:

The discussion in this thread seems pretty useful, but here's how I
tackled it using AR as much as possible.

http://blog.hasmanythrough.com/2006/9/9/finding-unassociated-objects