There have been many times when I needed to work with a relation in small chunks, e.g. deleting or updating a large number of rows while allowing the database to work on other tasks in between. This is especially useful when the number of affected rows is very large and we need to throttle our work so that other processes can start or complete between iterations. It could also be used to process each slice using concurrent workers.
People.where('age > 21').each_slice do |relation|
  # work on this slice here, e.g. relation.delete_all
  sleep 10 # give other tasks time to run
end
Called without a block, it will return an Enumerator, e.g.:
People.where('age < 18').each_slice.each(&:delete_all)
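To make the block / no-block behaviour concrete without a database, here is a minimal plain-Ruby sketch. FakeRelation is a hypothetical stand-in, not the code from the pull request: it only holds a list of ids, its delete_all merely reports how many rows it would touch, and the default batch size of 1000 is an assumption borrowed from #find_in_batches.

```ruby
# Hypothetical stand-in for a relation: it only knows a sorted list of ids.
class FakeRelation
  attr_reader :ids

  def initialize(ids)
    @ids = ids.sort
  end

  # Yields one FakeRelation per batch of ids. Without a block it returns an
  # Enumerator, mirroring the behaviour described above.
  # The default of 1000 is an assumption, not taken from the pull request.
  def each_slice(batch_size = 1000)
    return enum_for(:each_slice, batch_size) unless block_given?

    ids.each_slice(batch_size) { |batch| yield FakeRelation.new(batch) }
    self
  end

  # Pretends to delete the rows and returns how many would be affected.
  def delete_all
    ids.size
  end
end

people = FakeRelation.new((1..7).to_a)

# Block form: each slice is itself a (fake) relation.
people.each_slice(3) { |slice| puts "would delete ids #{slice.ids.inspect}" }

# Enumerator form: chain any Enumerable method onto it.
deleted_per_slice = people.each_slice(3).map(&:delete_all)
puts deleted_per_slice.inspect
```

The real method would of course yield ActiveRecord::Relation objects scoped to each batch rather than in-memory id lists.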
Other people have expressed interest in having such functionality as well. E.g., with a quick GitHub Issues search I found issues #20820 and #13147. There have been other attempts to achieve the same goal; however, they were backward-incompatible, untested and not optimised.
I have written passing tests in a similar style to those for #find_in_batches. I have made sure that the number of database queries is kept low (it is equal to, and in some cases lower than, that of #find_in_batches).
I have documented the method as well.
I have tried to stick to the coding style and conventions of Rails. Please let me know if I need to make any modifications or rename the method, so that I can prepare it for merging.
PS. Let me know what name you prefer for the method, e.g. #batches, #to_batches, #in_batches, #split, #each_slice, etc. So far, I have called it #each_slice because it divides a single ActiveRecord::Relation object into multiple ActiveRecord::Relation objects, similar to the more familiar Enumerable#each_slice (available on Array), which divides an Enumerable into multiple smaller Arrays.
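For readers less familiar with the method the name is borrowed from, this short sketch shows how Ruby's built-in Enumerable#each_slice behaves on an Array. The variable names are only for illustration.

```ruby
numbers = (1..10).to_a

# With an argument of 3, the array is divided into chunks of up to 3 elements.
slices = numbers.each_slice(3).to_a
puts slices.inspect

# Without a block, each_slice returns an Enumerator that can be consumed lazily.
enum = numbers.each_slice(4)
first_chunk = enum.next
puts first_chunk.inspect
```

The proposed method follows the same calling convention, except that each chunk is a relation rather than an Array.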
Here is the link to the pull request:
Thanks in advance for your help!