Strategy when dealing with LARGE data sets

I am a new convert to RoR and am loving every minute of it. Absolute highest praise goes to everyone who has contributed.

An issue has come up when dealing with tables with large numbers of rows and columns. Specifically, I have an issue in an ActiveRecord::Migration where, after adding a new column, I would like to iterate over each row, read an attribute, process that attribute, then update the new attribute/column with the result.

However, doing something like this: Stuff.find_all { |r| r['new_attr'] = process r['old_attr'] }

would kill the server by returning so many records. I am aware of the :select parameter to find(), but it would still kill the server because ALL records are returned in the array.

I see there are :limit and :offset parameters to find(), which would work. I figured I would ask whether there are any other workable solutions for dealing with large data sets before creating my own.

Thank you!

Jae
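
For reference, here is a minimal sketch of the :limit/:offset approach mentioned above. It is not taken from the thread. The helper name each_in_batches, the process method, and the in-memory ROWS array are all hypothetical stand-ins so that the example runs without a database. In a real migration the fetch lambda would presumably wrap something like Stuff.find(:all, :order => 'id', :limit => limit, :offset => offset), and each record would be saved after the update.

```ruby
# Hypothetical helper (not part of Rails): walks a large result set in
# fixed-size batches so that only one batch is held in memory at a time.
# `fetch` is any callable taking (limit, offset) and returning an Array.
def each_in_batches(fetch, batch_size = 1000)
  raise ArgumentError, "batch_size must be positive" unless batch_size > 0

  offset = 0
  loop do
    batch = fetch.call(batch_size, offset)
    batch.each { |record| yield record }
    # A short (or empty) batch means the end of the table was reached.
    break if batch.size < batch_size
    offset += batch_size
  end
end

# Stand-in for the table. Assumption: in a migration this would instead be
#   lambda { |limit, offset|
#     Stuff.find(:all, :order => 'id', :limit => limit, :offset => offset)
#   }
# Ordering by the primary key keeps the pages stable between queries.
ROWS = (1..2500).map do |i|
  { 'id' => i, 'old_attr' => "value #{i}", 'new_attr' => nil }
end
fetch = lambda { |limit, offset| ROWS[offset, limit] || [] }

# Hypothetical transformation of the old attribute.
def process(value)
  value.upcase
end

each_in_batches(fetch, 1000) do |row|
  row['new_attr'] = process(row['old_attr'])
  # With a real ActiveRecord object, a call to row.save would go here.
end
```

One caveat with this design: the column used for ordering should not be the one being updated, otherwise rows could shift between pages while the loop is running.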

Just saw this on the RSS feed yesterday, which might work for you...

Plugins - paginating_find - Agile Web Development http://www.agilewebdevelopment.com/plugins/paginating_find

Got 15,842 records that you'd like to export to a file? Using the standard Rails ActiveRecord::Base#find method will load all 15,842 into memory all at once and return them all in an array. If your app is running on a shared host, or if you're keeping your app on a memory budget, this is a big problem for you. So you could load each record one by one, but that'll kill your db server. Wouldn't it be sweet if #find could return an enumerable that would load your records in batches of say 1,500 records? Well with my new nifty-jifty paginating_find plugin, it can.

-philip

Thanks Philip... I'm in nirvana. Amazing. Now I have to find out what plugins are.

It's amazing. A single-line install, used it, tested it, it friggin works. Amazing!

Jae

You may find the recently released paginating_find plugin helpful: http://agilewebdevelopment.com/plugins/paginating_find