Out of memory generating huge CSV from ActiveRecord

I am trying to generate really large CSV files from ActiveRecord. (Admittedly, this is an edge-case test.) My test set is 700,000 records, which is very large. The type of find() below is supposed to work in pages and not hold all records in memory, but I get an out-of-memory error (see stack dump below). The printed count also never appears.

I got that information from here, in the first part of "nailing down the root cause".

I am also using faster_csv, which, according to some Google searches, is supposed to use less memory.

I am using JRuby 1.6.7 (Ruby 1.8.7) and ActiveRecord 2.3 (which I assume corresponds to Rails 2.3).

In the call here, attr = {:conditions=>["cdr_guid_id = :cdr_guid_id", {:cdr_guid_id=>30}]}

  def self.export_to_csv(attr)
    self.set_client(attr[:client])
    # client is our own thing,
    # and not part of active record find()
    attr.delete(:client)

    cnt = 0
    if hrec = self.find(:first)
      CSV.generate(path) do |ofil|
        ofil << hrec.visible_attributes.keys
        self.find(:all, attr).each do |rec|
          cnt += 1
          puts cnt.to_s if cnt % 100 == 0
          ofil << rec.visible_attributes.keys.map{|col| rec.send(col) }
        end
      end
    end
  end

No, your code will load all records; find(:all).each does that.

Switch to either find_in_batches or find_each.

See the documentation:

http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
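To illustrate the batching suggestion, here is a minimal sketch that runs without a database. `FakeModel`, its `Record` struct, and the `:total` option are hypothetical stand-ins invented for this example; only the block-taking shape of `find_in_batches` and the `:batch_size` option mirror ActiveRecord. It uses Ruby's standard `csv` library rather than faster_csv.

```ruby
require "csv"
require "stringio"

# Hypothetical stand-in for an ActiveRecord model. It yields records in
# fixed-size batches instead of returning them all at once.
class FakeModel
  Record = Struct.new(:id, :name)

  def self.find_in_batches(options = {})
    batch_size = options.fetch(:batch_size, 1000)
    total      = options.fetch(:total, 2500) # invented option, demo only
    (1..total).each_slice(batch_size) do |ids|
      # Only one batch of records exists at a time.
      yield ids.map { |id| Record.new(id, "name-#{id}") }
    end
  end
end

# Streams every batch into a single CSV writer; the header is written once.
def export_to_csv(io, options = {})
  csv = CSV.new(io)
  csv << %w[id name]
  batches = 0
  FakeModel.find_in_batches(options) do |records|
    batches += 1
    records.each { |rec| csv << [rec.id, rec.name] }
  end
  batches
end

out = StringIO.new
batch_count = export_to_csv(out, :batch_size => 1000, :total => 2500)
line_count  = out.string.lines.count
puts "#{batch_count} batches, #{line_count} lines"
```

With a real model, the block passed to find_in_batches would receive an array of records for each batch, so memory use is bounded by the batch size rather than by the size of the whole result set.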

Thanks, I have tried find_each and find_in_batches. I get a LocalJumpError (yield called out of block). Why would that be?

The approach here is to open the CSV for appending after the first batch, but so far it doesn't work because of the error.

  def self.export_to_csv(attr)
    self.set_client(attr[:client])
    # client is our own thing,
    # and not part of active record find()
    attr.delete(:client)

    cnt = 0
    path = attr[:path]
    attr.delete(:path)

    if hrec = self.find(:first)
      mode = 'w'
      self.find_in_batches(attr).each do |recs|
        CSV.open(path,mode) do |ofil|
        # CSV.open(path, 'w') do |ofil|
        # ofil << rec.class.column_names
          p attr
          ofil << hrec.visible_attributes.keys
          cnt += 1
          puts cnt.to_s if cnt % 100 == 0
          recs.each do |rec|
          # ofil << rec.class.column_names.map{|col| rec.send(col) }
            ofil << rec.visible_attributes.keys.map{|col| rec.send(col) }
          end
        end
        mode = 'a'
      end
    end
  end

LocalJumpError - yield called out of block:
 C:/Users/lgu/tools/recordset/vendor/activerecord-2.3.8/lib/active_record/batches.rb:66:in `find_in_batches'
 ./recordset_models.rb:461:in `export_to_csv'
 ./recordset_models.rb:368:in `export_to_csv'
 recordset.rb:72:in `__file__'
 org/jruby/RubyProc.java:270:in `call'
 org/jruby/RubyMethod.java:129:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
 org/jruby/RubyKernel.java:2045:in `instance_eval'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
 org/jruby/RubyKernel.java:1183:in `catch'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
 org/jruby/RubyArray.java:1615:in `each'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
 org/jruby/RubyKernel.java:2045:in `instance_eval'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
 org/jruby/RubyKernel.java:1183:in `catch'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
 C:/Users/lguild/lguild_BED-L-LGUILD_361/lguild_BED-L-LGUILD_361/TAAS/Trunk/tools/recordset/vendor/nakajima-rack-flash-0.1.0/lib/rack/flash.rb:154:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:195:in `context'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:190:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/head.rb:9:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/commonlogger.rb:20:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/showexceptions.rb:21:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/methodoverride.rb:24:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
 c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/handler/webrick.rb:59:in `service'
 c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
 c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
 c:/jruby-1.6.7/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
 org/jruby/RubyProc.java:270:in `call'
 org/jruby/RubyProc.java:224:in `call'
127.0.0.1 - - [08/May/2012:10:37:07 EDT] "GET /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv HTTP/1.1" 500 91568
http://localhost:4567/ -> /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv

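Apparently I did not want the each(). With .each in the chain, the do block binds to each rather than to find_in_batches, so find_in_batches is called without a block and its yield raises the LocalJumpError. Instead of

  self.find_in_batches(attr).each do |recs|

it should be

  self.find_in_batches(attr) do |recs|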
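To see why the extra `.each` triggers the error, here is a minimal sketch in plain Ruby. The `find_in_batches` method below is a hypothetical stand-in, not the ActiveRecord implementation; it assumes only what the stack trace shows, namely that the method yields without first checking whether a block was given.

```ruby
# Hypothetical stand-in: yields each batch unconditionally, with no
# block_given? check, like the method named in the stack trace.
def find_in_batches(items, batch_size)
  items.each_slice(batch_size) { |batch| yield batch }
end

seen = []

# Block passed directly to the method: works.
find_in_batches([1, 2, 3, 4, 5], 2) { |batch| seen << batch }

# Block attached to .each instead: find_in_batches itself receives no
# block, so its yield raises LocalJumpError before .each is ever reached.
error = nil
begin
  find_in_batches([1, 2, 3, 4, 5], 2).each { |batch| seen << batch }
rescue LocalJumpError => e
  error = e
end

puts "batches seen: #{seen.inspect}"
puts "error class: #{error.class}"
```

The second call never adds anything to `seen`, because the exception is raised inside the stand-in method on its first yield.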