Out of memory generating huge CSV from ActiveRecord

I am trying to generate really large CSV files from ActiveRecord.
This is an edge-case test: the set I am using contains 700,000
records, which is very large. The type of find() below is supposed to
work in pages and not hold all the records in memory, but I get an
out of memory error (see stack dump below). The print of the count
also never comes out.

I got that info from the first part of "Nailing down the root cause"
here:
http://www.engineyard.com/blog/2009/thats-not-a-memory-leak-its-bloat/

I am also using faster_csv, which according to some Google searches
is supposed to use less memory.

I am using JRuby 1.6.7 (Ruby 1.8.7) and ActiveRecord 2.3 (which I
assume corresponds to Rails 2.3).

In the call here, attr =
{:conditions=>["cdr_guid_id = :cdr_guid_id", {:cdr_guid_id=>30}]}

  def self.export_to_csv(attr)
    self.set_client(attr[:client])
    # client is our own thing,
    # and not part of active record find()
    attr.delete(:client)

    cnt = 0
    if hrec = self.find(:first)
      CSV.generate(path) do |ofil|
        ofil << hrec.visible_attributes.keys
        self.find(:all, attr).each do |rec|
          cnt += 1
          puts cnt.to_s if cnt % 100 == 0
          ofil << rec.visible_attributes.keys.map { |col| rec.send(col) }
        end
      end
    end
  end

No, your code will load all the records; find(:all).each does that.

Change to either find_in_batches or find_each.

See the documentation:

http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
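
As a rough illustration of the batch pattern, here is a minimal sketch in plain Ruby. It does not use ActiveRecord: each_batch is a hypothetical stand-in for find_in_batches, export_rows is an invented helper name, and the rows are a small in-memory array, so it only shows the shape of the loop (open the file once, write the header once, then write each batch as it arrives). It assumes a Ruby where the standard csv library is available; on 1.8.7 the faster_csv gem plays that role.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for find_in_batches: hands the rows to the
# caller's block one slice at a time.
def each_batch(rows, batch_size)
  rows.each_slice(batch_size) { |batch| yield batch }
end

# Open the file once, write the header once, then write every batch.
# Returns the number of data rows written.
def export_rows(path, header, rows, batch_size = 1000)
  written = 0
  CSV.open(path, 'w') do |csv|
    csv << header
    each_batch(rows, batch_size) do |batch|
      batch.each do |row|
        csv << row
        written += 1
      end
    end
  end
  written
end

# Sample data, invented for the example.
rows = (1..2500).map { |i| [i, "name#{i}"] }
file = Tempfile.new(['export', '.csv'])
count = export_rows(file.path, %w[id name], rows, 1000)
puts count
```

With a real model, the each_batch call would be replaced by the batched finder, and only one batch of records would need to be alive at a time.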

Thanks, I have tried find_each and find_in_batches. I seem to get a
LocalJumpError, "yield called out of block". Why would that be?

The approach here is to open the CSV for appending after the first
batch, but it doesn't work so far because of the error.

  def self.export_to_csv(attr)
    self.set_client(attr[:client])
    # client is our own thing,
    # and not part of active record find()
    attr.delete(:client)

    cnt = 0
    path = attr[:path]
    attr.delete(:path)

    if hrec = self.find(:first)
      mode = 'w'
      self.find_in_batches(attr).each do |recs|
        CSV.open(path,mode) do |ofil|
        # CSV.open(path, 'w') do |ofil|
        # ofil << rec.class.column_names
          p attr
          ofil << hrec.visible_attributes.keys
          cnt += 1
          puts cnt.to_s if cnt % 100 == 0
          recs.each do |rec|
          # ofil << rec.class.column_names.map{|col| rec.send(col) }
            ofil << rec.visible_attributes.keys.map { |col| rec.send(col) }
          end
        end
        mode = 'a'
      end
    end
  end

LocalJumpError - yield called out of block:
C:/Users/lgu/tools/recordset/vendor/activerecord-2.3.8/lib/active_record/batches.rb:66:in `find_in_batches'
./recordset_models.rb:461:in `export_to_csv'
./recordset_models.rb:368:in `export_to_csv'
recordset.rb:72:in `__file__'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyMethod.java:129:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
org/jruby/RubyArray.java:1615:in `each'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
C:/Users/lguild/lguild_BED-L-LGUILD_361/lguild_BED-L-LGUILD_361/TAAS/Trunk/tools/recordset/vendor/nakajima-rack-flash-0.1.0/lib/rack/flash.rb:154:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:195:in `context'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:190:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/head.rb:9:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/commonlogger.rb:20:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/showexceptions.rb:21:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/methodoverride.rb:24:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/handler/webrick.rb:59:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyProc.java:224:in `call'
127.0.0.1 - - [08/May/2012:10:37:07 EDT] "GET /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv HTTP/1.1" 500 91568
http://localhost:4567/ -> /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv

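Apparently I did not want the each():

self.find_in_batches(attr).each do |recs|

but rather

self.find_in_batches(attr) do |recs|

With .each, the block is attached to each() rather than to
find_in_batches, so find_in_batches is called without a block and its
yield raises the LocalJumpError.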
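
The difference between the two forms can be reproduced without ActiveRecord. The sketch below uses a hypothetical method, in_batches, that yields unconditionally, which is what the trace suggests batches.rb does in 2.3.8. It is only an illustration of how the block binds, not the library's code.

```ruby
# Hypothetical stand-in for a method that always calls yield.
def in_batches(items, size)
  items.each_slice(size) { |batch| yield batch }
  nil
end

# Block attached to the method call itself: the method receives it.
seen = []
in_batches([1, 2, 3, 4, 5], 2) { |batch| seen << batch }

# Block attached to a chained .each: the method runs with no block,
# so its yield raises before .each is ever reached.
error = nil
begin
  in_batches([1, 2, 3, 4, 5], 2).each { |batch| seen << batch }
rescue LocalJumpError => e
  error = e
end
puts error.class
```

In the second call the block belongs to each(), which is never reached because the method fails first.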