True ActiveRecord result set iteration

Hello,

For an internal project we were looking for a solution that extends ActiveRecord with an iterator method. Using find(:all) is no fun if you have to process some 10,000 records, because every record is loaded into memory at once.

Until now there have been two ways of dealing with that scenario:

- write your logic a second time (e.g. as a stored procedure)
- bet on AR plugins which work around that limitation by fetching IDs and processing them in groups, or by using :offset and :limit
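To illustrate the second workaround, here is a minimal sketch of :offset/:limit style batching in plain Ruby. It is not taken from any of the plugins: the ROWS array stands in for a table, and fetch_page and each_in_batches are hypothetical names made up for this example.

```ruby
# Hypothetical stand-in for a database table: an array of row hashes.
ROWS = (1..10).map { |i| { :id => i, :name => "customer #{i}" } }

# Stand-in for one query with :offset and :limit.
# In a real application every call would be a separate SELECT.
def fetch_page(rows, offset, limit)
  rows[offset, limit] || []
end

# Walks the whole "table" in batches and yields one row at a time.
def each_in_batches(rows, batch_size)
  offset = 0
  loop do
    page = fetch_page(rows, offset, batch_size)
    break if page.empty?
    page.each { |row| yield row }
    offset += batch_size
  end
end

# Usage: collect all ids, three rows per "query".
ids = []
each_in_batches(ROWS, 3) { |row| ids << row[:id] }
```

Because each batch is its own query, rows that are inserted or deleted between two batches can shift the offsets, which is one reason this approach does not behave like a single consistent result set.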

Rewriting logic is something we wanted to avoid, and the plugins don't fully respect transactional contexts. So we started to implement our own true iterator support for AR. Our project is on Oracle, but in the meantime we have added support for MySQL. Other adapters can probably be extended easily, too. We also tried JRuby 1.1.x, which is sometimes faster than Ruby 1.8.6, but a patch is needed to make the Java part of the connection adapter suitable for result set iteration.

Okay, you're about to ask: how does it work? Here we go:

MyTable.find_each_by_sql("some SQL here") { |my_table| ... }

MyTable.find_each(:all, :conditions => ..., :include => ...) { |my_table| ... }
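The general idea behind such a method can be sketched in plain Ruby. This is not the code from the attached plugin: FakeCursor is a hypothetical stand-in for a database cursor, MyTable here is a plain Ruby class rather than an ActiveRecord model, and each_from_cursor is a name invented for this sketch.

```ruby
# Hypothetical cursor: returns one row per call to #fetch, nil when exhausted.
class FakeCursor
  def initialize(rows)
    @rows = rows
    @pos = 0
  end

  def fetch
    row = @rows[@pos]
    @pos += 1
    row
  end
end

# Plain Ruby stand-in for a model class (not ActiveRecord).
class MyTable
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Builds one object per fetched row and yields it right away,
  # so only a single record has to be held in memory at a time.
  # Returns the number of records processed.
  def self.each_from_cursor(cursor)
    count = 0
    while (row = cursor.fetch)
      yield new(row)
      count += 1
    end
    count
  end
end

# Usage: process rows one by one instead of building a full array first.
cursor = FakeCursor.new([{ "name" => "Alice" }, { "name" => "Bob" }])
MyTable.each_from_cursor(cursor) { |my_table| puts my_table.attributes["name"] }
```

The difference to find(:all) is that no array of all records is ever built; each object can be garbage collected as soon as the block has finished with it.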

Attached you will find the magic code, which can be used as a plugin for Rails. When testing, please keep in mind that only Oracle and MySQL are fully supported. JDBC will take lots of RAM for large result sets until you have patched the JdbcConnectionAdapter.

Some figures with JRuby: I've tested the code with an export of ~80,000 customer data records. Originally I couldn't run the export with less than 2 GB of heap space (JRuby 1.1.4 without extensive garbage collection). After patching the connection adapter, it works with less than 128 MB of heap space (JRuby 1.1).

I'd be happy if our idea were picked up and these iterator methods were integrated into AR. I've seen lots of people asking for exactly this behavior. It's possible to implement, it's easy to implement, and IMHO it doesn't break the AR metaphor.

If you like the idea and want to send feedback, please CC me. I'm not subscribed to the list.

Regards, Andreas

active_record_iterator.tgz (6.98 KB)

> I'd be happy if our idea were picked up and these iterator methods were integrated into AR. I've seen lots of people asking for exactly this behavior. It's possible to implement, it's easy to implement, and IMHO it doesn't break the AR metaphor.

You might want to discuss this on the rubyonrails-core list. It's where people who want to discuss the development of Rails itself hang out.

Fred