RFC: Remove support for passing argument to force association reload

I already asked a question about refactoring record.associations(true) -> record.associations(reload: true) here: https://groups.google.com/forum/#!topic/rubyonrails-core/f756F2DLuG0

However, Eugene raises an interesting question in the PR (https://github.com/rails/rails/pull/20883#issuecomment-121419119) that it might actually make sense to just deprecate the support for record.associations(true), and asks user to do record.associations.reload instead.

I think it make sense, but I want to hear how people think first. Internally, it actually does the same thing because the current association reader code actually calls #reload internally as well.

So, do you think it’s a good idea to just deprecate and later remove the support for record.associations(true) instead?

Thanks,

Prem

Personally I find the double meaning of #reload a bit confusing for singular associations, I would expect record.association.reload to reload the current instance of the target object, but record.association(reload: true) to reload the association itself. The behavior is different if the associated object has changed, for example if the record matching a has_one has changed.

In general I think it’s a bad idea for singular association proxy objects to intercept any instance methods. So I wouldn’t want the has_many associations to do it either, for consistency.

Can someone provide a link to the area of the Rails API or documentation you are discussing.

Search for has_one and has_many on api.rubyonrails.org.

thanks

Can’t you chain like record.associations(reload: true).where - if you do reload I think we’d lose that

Will, You actually made a good point there. Removing it would break the support for reloading has_one relationship. Take a look at these SQLs:

student = Student.first

Student Load (0.5ms) SELECT “students”.* FROM “students” ORDER BY “students”.“id” ASC LIMIT 1

student.user

User Load (0.6ms) SELECT “users”.* FROM “users” WHERE “users”.“profile_id” = $1 AND “users”.“profile_type” = $2 LIMIT 1 [[“profile_id”, 1], [“profile_type”, “Student”]]

student.user(true)

User Load (0.4ms) SELECT “users”.* FROM “users” WHERE “users”.“profile_id” = $1 AND “users”.“profile_type” = $2 LIMIT 1 [[“profile_id”, 1], [“profile_type”, “Student”]]

student.user.reload

User Load (0.5ms) SELECT “users”.* FROM “users” WHERE “users”.“id” = $1 LIMIT 1 [[“id”, 1]]

``

For Kevin, chaining it in has_many actually works as reload would still return a CollectionProxy. However, as it does not work for has_one I think It does not make sense to remove it.

In that case, I think we have keep it and just add associations(reload: true) to make it more obvious/verbose.

Thanks for all the comments.

-Prem

I’d prefer to see us move to a single API, and IMO the trade-offs for #reload alone fits the bill. There’s no contract saying that a singular object from has_one/belongs_to and a collection like has_many has to behave the same. In fact, I think it’d be strange to think that it should. A single string and an array of strings do not behave the same.

So project.documents.reload.first makes perfect sense to me as reloading the documents collection, then grabbing the first one. projects.owner.reload.name makes sense to me as reloading the owner record itself, then fetching the name (if we don’t already return self from Record#reload, we should).

I agree that having a single API is nice. However, I feel like it would be confusing if user want to reload the has_one association, as they have to reload the original record instead of just passing true.

Consider this scenario:


# User has_one :profile, Profile belongs_to :user
> user = User.first
> user.profile #=> #<Profile id: 1, user_id: 1>
> Profile.find(1).update_attrbutes!(user_id: 2)
> Profile.find(1) #=> #<Profile id: 1, user_id: 2>
> user.profile #=> #<Profile id: 1, user_id: 1>
> user.profile.reload #=> #<Profile id: 1, user_id: 2> # << ?!
> user.profile(true) #=> nil

You can see that by removing the support for passing an argument to force reload makes it more confusing as you get a stale record. The workaround is to call user.reload.profile instead.

If you think that workaround is good enough, I don’t mind submit another PR to remove the support to simplify the API, though.

-Prem

DHH wrote:

I’m content with that work-around. I think it’s a very rare situation. Let’s make common things easy and uncommon things possible.

Alright, I’m sold. Let me submit another PR to deprecate that then.

Thank you for your thoughts, everyone.

-Prem

Another idea for the work-around is to add a reload_* association method, e.g., user.reload_profile.

I’m not sure it’s worth adding yet another generated association method (or even a good idea), but that would make the reload interface consistent with the build_* and create_* association methods, as I remember them.

-Al

So are you saying project.documents.reload.first would redo the query to find the associated record, or would do a find(id) to reload the current instance?

project.documents.reload.first would reload the documents association, then grab the first entry in that array – not trigger another find(id). Like calling #load, but ignoring whether it had already been loaded.

OK, so what about the has_one case?

Say Project has_one :manager, and project.manager is already loaded (which did a “SELECT * FROM managers WHERE project_id = ?”), will project.manager.reload do another “SELECT * FROM managers WHERE project_id = ?”, or will it do a "SELECT * FROM managers WHERE id = ?”.

To match the has_many I would expect the former, and sometimes we need this behavior if this is the only way to force the association to be looked up again (whereas currently we can do project.manager(true)).

But because the object returned by project.manager looks like it is just a plain record instance, I’d expect #reload to behave like the query, a regular find(id). Especially if I call reload(:lock => true).

It seems a big ambiguous what the behavior would be for has_one, which makes me think the original project.manager(:reload => true) is clearer - and then reload on the target of a belongs_to or has_one should always just reload that instance, not load a potentially different record.

Do you agree?

I think the first assumption to challenge is that we can have parity between a collection and a single object. I don’t think we can or should. An array of strings does not have parity with a single string.

So project.manager.reload will call ActiveRecord::Base#reload, so that’s a Manager.find(id) call, not a Manager.find_by(project: x) call. If you absolutely need the latter, you can do project.reload.manager. I’ve found this to be an exceedingly rare case, though. So I’m find have it be a round-about to get there, and I’m fine not having parity between collection and record.

I think you’re right that we can’t have complete parity, but then as you said, it would be good to have a single API.

It does works well right now having the same API to always get the fresh association loaded, and it is useful to be able to definitely get the current record, even if you don’t personally use it that often :).

There is a practical difference between the current project.manager(true) which always does exactly one query, whether or not manager is already loaded, and project.reload.manager, which always does two queries and also changes other stuff on the project instance.

Surely it is not necessary to take away the ability to do exactly one association reload?

Otherwise you could just as equally argue that there is no use for reload on a has_many. Personally I think I reload has_ones as often as I reload has_manys, and for the same reasons - I need to be sure I am dealing with the current child/children of an object I am halfway through doing stuff to (which is why I don’t want to reload the parent).

I think Al’s suggestion here is a good one. It gets away from the single API to reload any type of association, but at least it’s possible to do the association reload when ya need it, and it’s crystal clear what is happening.

I’d be happy to consider some real code from a real project that shows the use of reloading a single has_one association in a performance hotspot where two ID lookups are proving to be a problem. But for the time being, I’m content with the trade off that has collection#reload and record#reload as the two only ways to reloading those associations. And if someone needs #reload_manager in their model because they do use that a lot, it’s trivial to implement on its own.

Do appreciate the debate over this, though! Good to go through the reasoning of why we do and make certain changes.

Oh, sorry, I explained that badly.

I’m not saying it’s a performance problem doing two queries. I’m saying it’s causing unintended side-effects, because when you do the query on the parent it resets the attributes, including blowing away unsaved changes. This is absolutely undesirable if you just want to make sure you have the current child.

How is it trivial to implement reload_manager on its own, BTW? You would have to write out the foreign key SQL, do the query, and then assign back to the association?

I’m comfortable with those trade-offs. Barring any compelling real world code shining a different light on the discussion than what we’ve covered so far, I don’t think we’re going to make any additional progress discussing it.