improved find_in_batches

Cody_Cutrer · November 25, 2013, 9:00pm

find_in_batches has several major limitations:

order and limit are not supported
joins can break it (i.e. if there end up being multiple rows with the same primary key, the next batch might miss some that were truncated by the limit)
it forces a table scan because it’s ordering by primary key, making it inefficient
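
To make the join problem concrete, here is a minimal sketch in plain Ruby (no ActiveRecord involved) that simulates batching by primary key, i.e. "WHERE id > last_id ORDER BY id LIMIT batch_size", over a joined result set. The helper `pk_batches` and the sample rows are made up for illustration, and the secondary sort on `:comment` is only there to keep the simulation deterministic.

```ruby
# Simulates primary-key batching over an in-memory "result set".
# Hypothetical helper for illustration; not part of ActiveRecord.
def pk_batches(rows, batch_size)
  batches = []
  last_id = nil
  loop do
    # ORDER BY id (tie-broken on comment so the simulation is deterministic)
    remaining = rows.sort_by { |r| [r[:id], r[:comment]] }
    # WHERE id > last_id
    remaining = remaining.select { |r| r[:id] > last_id } if last_id
    # LIMIT batch_size
    batch = remaining.first(batch_size)
    break if batch.empty?
    batches << batch
    last_id = batch.last[:id]
    # A short batch is treated as the final one
    break if batch.size < batch_size
  end
  batches
end

# A join against a has-many association returns id 2 three times.
JOINED_ROWS = [
  { id: 1, comment: "a" },
  { id: 2, comment: "b" },
  { id: 2, comment: "c" },
  { id: 2, comment: "d" },
  { id: 3, comment: "e" },
]

SEEN   = pk_batches(JOINED_ROWS, 2).flatten.map { |r| r[:comment] }
MISSED = JOINED_ROWS.map { |r| r[:comment] } - SEEN
```

With a batch size of 2, the first batch ends in the middle of the id 2 rows, and the next batch starts at id > 2, so MISSED ends up holding "c" and "d".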

All of these can be worked around by either using cursors or temporary tables. Would a patch to automatically use such features (if the need for it is detected, like where it is currently warning about order and limit) be accepted? What would be the suggested way to structure such a patch, given that it would use DB specific features? Add some stubs to SchemaStatements or AbstractAdapter, and call them from find_in_batches? I’m guessing detecting the adapter inline is frowned upon.
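
As a rough sketch of the "stubs on the adapter" idea, something like the following is what I have in mind. Everything here is hypothetical: the class names are stand-ins rather than the real ActiveRecord adapters, and `supports_cursor_batches?` and `batch_strategy` are invented names, loosely modelled on existing adapter predicates such as `supports_savepoints?`. The fallback to primary-key batching is a simplification for the example.

```ruby
# Hypothetical stand-ins; these are NOT the real ActiveRecord adapter classes.
class SketchAbstractAdapter
  # Default stub: no cursor support unless a subclass overrides it.
  def supports_cursor_batches?
    false
  end

  # What a find_in_batches implementation could ask the adapter,
  # instead of checking the adapter type inline.
  def batch_strategy(needs_order_or_limit)
    if needs_order_or_limit && supports_cursor_batches?
      :cursor
    else
      :primary_key
    end
  end
end

# A database-specific adapter opts in by overriding the stub.
class SketchPostgresAdapter < SketchAbstractAdapter
  def supports_cursor_batches?
    true
  end
end
```

This keeps the DB specific decision behind the adapter, so the batching code only ever calls the stub.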

For reference, we’re using such an implementation right now in our project (Rails 2.3; we’re in the process of upgrading): canvas-lms/active_record.rb at release/2013-11-16.13 · instructure/canvas-lms · GitHub.

Cody Cutrer
