Active Record’s current behavior is to load an entire association when calling finder methods like first and last on a dirty association. I’m concerned this leaves a hazard for developers to stumble over: in a case where @category.products would load millions of records into memory, calling @category.products.build and then @category.products.first (or last) would easily trigger an out-of-memory error:
# unloaded association
@category.products.first
# SELECT products.* FROM products
# WHERE products.category_id = 1
# ORDER BY products.id ASC LIMIT 1
# dirty association
@category.products.build
@category.products.first
# SELECT products.* FROM products
# WHERE products.category_id = 1
# (Note the absence of a LIMIT clause on the second query.)
(see issue 39455)
I’d like to cut down on memory consumption by only pulling the necessary number of records (rather than fully loading an association) when calling @category.products.first, @category.products.first(5), @category.products.last, etc. Here’s my PR:
https://github.com/rails/rails/pull/39627
However, Ryuta Kamizono (@kamipo) raised an interesting flag: calling first on a loaded association and on an unloaded one won’t always return the same record, because records are given a default ordering when calling first on an unloaded association, while loaded associations receive no additional ordering.
So, the change I’m proposing slightly alters the behavior of first. I appreciate this concern, but I feel the benefit of this change outweighs the cost. (Plus, I might respectfully assert that the existing behavior is inconsistent and separate from the issue I’m trying to tackle.)
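To make the ordering concern concrete, here is a toy model in plain Ruby. It is only a sketch, not Active Record's implementation: the Product struct, the ROWS array, and both helper methods are hypothetical names, and it assumes the database happens to return rows out of primary-key order when no ORDER BY is given.

```ruby
# Toy model only: a plain Ruby array stands in for the database and for the
# association target. Nothing here is Active Record's actual code.
Product = Struct.new(:id, :name)

# Rows in the order the database happens to return them without an ORDER BY
# (assumed here to be out of primary-key order).
ROWS = [
  Product.new(3, "c"),
  Product.new(1, "a"),
  Product.new(2, "b")
].freeze

# Unloaded association: `first` adds ORDER BY id ASC LIMIT 1,
# so the record with the lowest id comes back.
def first_when_unloaded(rows)
  rows.min_by(&:id)
end

# Loaded association: the records were fetched with no extra ordering,
# and `first` just takes the first element already in memory.
def first_when_loaded(rows)
  rows.first
end

puts first_when_unloaded(ROWS).id # prints 1
puts first_when_loaded(ROWS).id   # prints 3
```

Under that assumption the two calls disagree today, which is why I consider the inconsistency to predate my change.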
I took a break from working on this about a month ago, as my PR wasn’t getting much attention. But before setting it aside entirely and moving on to another project, I wanted to start a thread here and lobby to include my change in the next minor release, where some behavior changes may be justified.
Thanks for the consideration!