AR::Base#exists?

exists?
AR::Base.exists? used to basically say !find(:first, :conditions =>)...

It now (effectively) says !find(:first, :select => 'id', :conditions => )

I imagine this is to save on instantiating a large object with many columns just to throw it away. This is all good and well, but has broken some things for us: on some models we have after_find callbacks that rely on the presence of attributes. In our particular cases we just moved to setting up those things lazily, you could also check that the attributes are actually here. However morally I wouldn't expect exists? to instantiate anything at all, which would make the problem go away (and in a sense take the above change a little further).

Rewriting exists? as

def exists?(id_or_conditions)
   sql = construct_finder_sql(:first, :select => "#{quoted_table_name}.#{primary_key}", :conditions => expand_id_conditions(id_or_conditions) :limit =>1)
   connection.select_all(sanitize_sql(sql)).size > 0
end

does this, but it's rather more verbose, so I'm not sure it's actually worth it.
Thoughts?

Fred

I'm the author of the new exists? code, and I also looked at doing it with a lower-level connection call instead of a find with :select option. I don't have the data around anymore, but as I recall, the benchmarks showed that using the select_all approach didn't perform as well, so I went with the find with :select approach. I hadn't thought about the after_find callback interaction, and I guess no one else has run into that problem before now either. Some stuff has changed in AR that might have affected performance of the select_all approach since I tried it last, so it might be worth revisiting now.

I'm the author of the new exists? code, and I also looked at doing it
with a lower-level connection call instead of a find with :select
option. I don't have the data around anymore, but as I recall, the
benchmarks showed that using the select_all approach didn't perform as
well, so I went with the find with :select approach. I hadn't thought
about the after_find callback interaction, and I guess no one else has
run into that problem before now either. Some stuff has changed in AR
that might have affected performance of the select_all approach since
I tried it last, so it might be worth revisiting now.

Well I ended up with that code by taking apart find :first and
removing the instantiation of rows at the end, so I don't think it
should be a problem.
What I posted should be doing exactly what find :first does, but
without the instantiation

Fred

Well I ended up with that code by taking apart find :first and
removing the instantiation of rows at the end, so I don't think it
should be a problem.
What I posted should be doing exactly what find :first does, but
without the instantiation

Why not just use count?

count(:id, :conditions =>...) > 0

A count makes the database do a lot more work as it must count all
matching rows not just find the first one that matches. Depending
on the exact conditions used and the database schema/indices/size the
difference can be very large.

I ran some numbers today:

ids = (1..10000).map { rand 2000000}

Benchmark.bmbm do |x|
   x.report("exists") { ids.each {|id| Question.exists? id} }
end

Current trunk (find:first)

Rehearsal ------------------------------------------
exists 4.700000 0.620000 5.320000 ( 6.437581)
--------------------------------- total: 5.320000sec

              user system total real
exists 4.690000 0.620000 5.310000 ( 6.574966)

select_all

Rehearsal ------------------------------------------
exists 4.650000 0.740000 5.390000 ( 8.984423)
--------------------------------- total: 5.390000sec

              user system total real
exists 4.010000 0.590000 4.600000 ( 5.606526)

Not mind blowing and it's obviously not going to make a massive difference to anyone, but for me at least sets aside any performance worries.

Fred

Hey Fred.

I've replied to your ticket, given you my +1, and made a suggestion in
the form of another patch. This is a good idea.

James

Hey Fred.

I've replied to your ticket, given you my +1, and made a suggestion in
the form of another patch. This is a good idea.

Cool.Not sure why but I wasn't able to make it look nice without
splitting it into 2 but you change looks good.

Fred

Hi Fred,

I ran your benchmark and I get similar results. That is, I can't
duplicate hm:j's findings that going to the raw connection is slower.

So +1 from me, and as James H has done, I've put an alternate patch
there - use select_value rather than select_all.

Trevor

Could someone cast an eye on the (3 +1s) patch at http://dev.rubyonrails.org/ticket/10605
  ?

Thanks,

Fred