Updating a Field in All Records of a Model, Efficiently?

Hello Everyone,

I have a Model for which there are many thousands (hundreds of thousands) of records. I'd like to update the value of one field for each record in a rake task.

A ruby based function must be applied to this field for all records (i.e. can't perform the algorithm in an SQL UPDATE alone).

First Thought: MyModel.find(:all).each { |m| m.my_field = function_call_here; m.save } Load all the models into memory, and I'd like to avoid that.

Second Thought: Get the maximum "id" of the Model's records and program a loop, doing a "find_by_id" for each record. While the model does have an "id" field, some records have been deleted, so I would have to check that the "find_by_id" is nil...

I'm not worried about new records being added to the database as this update will be run during a "maintenance" period.

Anyone have any thoughts on how you would update every record for a model when you have a large number of records?

Thank you for your time and help.

-Jim

I may not understand your second thought, but I have a hard time imagining it will be faster than the loop over all records in your first thought. Since you've got to retrieve the record & make it an AR instance in any case, you may as well not mess around guessing IDs...

Are you sure you can't translate your ruby function to SQL? That's going to be loads faster... Can you break your records up into chunks that should all get the same value of my_field & do a series of subset UPDATEs maybe?

Jim,

When I've had to do a similar thing over a table with many (100,000+) records, I've done something like:

total = Model.count limit = [ 100, total ].min 0.step(total-1, limit) do |offset|    Model.find(:all, :limit => limit, :offset => offset).each do |model|      # do stuff    end end

If you have a condition that limits what comes back, you might have to tweak the offset if the "stuff" you do causes records to fall out of the condition.

Roy's idea of a series of Model.update_all() calls for each unique set of whatever goes into your function sounds like a winner if the function is predictable enough.

-Rob

Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com