#all changes in 4.0

Hi guys,

I don't think that the changes made to the behavior of #all in 4.0 are a very good idea.

I can see that you no longer need to call all in as many cases as you did before - that's fine, just don't call it if you don't want it. But that doesn't mean you never need it or that people who do need it should not have it available.

1. Yes you can use to_a in many cases, but it behaves differently - for example if you have an association, to_a will return the cached target if the association has already been loaded. You absolutely need a way to run an actual query when you want the latest results. to_a cannot be relied upon to do this in all cases.

Note lack of a second query:

irb(main):006:0> p = Project.first
  Project Load (0.2ms) SELECT "projects".* FROM "projects" ORDER BY "projects"."id" ASC LIMIT 1
=> #<Project id: 1, created_at: "2013-02-27 09:38:49", updated_at: "2013-02-27 09:38:49">
irb(main):007:0> p.tasks.to_a
  Task Load (0.2ms) SELECT "tasks".* FROM "tasks" WHERE "tasks"."project_id" = ? [["project_id", 1]]
=> [#<Task id: 1, project_id: 1, created_at: "2013-02-27 09:38:52", updated_at: "2013-02-27 09:38:52">, #<Task id: 2, project_id: 1, created_at: "2013-02-27 09:38:53", updated_at: "2013-02-27 09:38:53">]
irb(main):008:0> p.tasks.to_a
=> [#<Task id: 1, project_id: 1, created_at: "2013-02-27 09:38:52", updated_at: "2013-02-27 09:38:52">, #<Task id: 2, project_id: 1, created_at: "2013-02-27 09:38:53", updated_at: "2013-02-27 09:38:53">]
irb(main):010:0> p.tasks.all
DEPRECATION WARNING: Relation#all is deprecated. If you want to eager-load a relation, you can call #load (e.g. `Post.where(published: true).load`). If you want to get an array of records from a relation, you can call #to_a (e.g. `Post.where(published: true).to_a`). (called from irb_binding at (irb):10)
=> [#<Task id: 1, project_id: 1, created_at: "2013-02-27 09:38:52", updated_at: "2013-02-27 09:38:52">, #<Task id: 2, project_id: 1, created_at: "2013-02-27 09:38:53", updated_at: "2013-02-27 09:38:53">]
irb(main):011:0> p.tasks.all
DEPRECATION WARNING: Relation#all is deprecated. If you want to eager-load a relation, you can call #load (e.g. `Post.where(published: true).load`). If you want to get an array of records from a relation, you can call #to_a (e.g. `Post.where(published: true).to_a`). (called from irb_binding at (irb):11)
=> [#<Task id: 1, project_id: 1, created_at: "2013-02-27 09:38:52", updated_at: "2013-02-27 09:38:52">, #<Task id: 2, project_id: 1, created_at: "2013-02-27 09:38:53", updated_at: "2013-02-27 09:38:53">]

2. It's very important that queries run at the point you think they do in any application that uses locks or concurrency. Again, if you don't use locks or concurrency, fine - don't call the query methods. But many people do and they need to be able to run the queries to make this work.

3. It's not true that you no longer need to care whether you have an array or a relation. For example, methods like sum with a block need arrays, as the deprecation makes clear:

irb(main):009:0> p.tasks.sum(&:id)
DEPRECATION WARNING: Calling #sum with a block is deprecated and will be removed in Rails 4.1. If you want to perform sum calculation over the array of elements, use `to_a.sum(&block)`. (called from irb_binding at (irb):9)
  Task Load (0.1ms) SELECT "tasks".* FROM "tasks" WHERE "tasks"."project_id" = ? [["project_id", 1]]
=> 3

4. It's true that making all basically useless means you can now call all on a model class itself and get a relation and then you can merge that or whatever, which was one of the other examples in the changelog. But you could do that already - using scoped. It is not necessary to break #all's behavior to get this functionality.

Have I misunderstood the change?

If not, can we please put back the query method? Running queries is a pretty core responsibility of ActiveRecord.

Thanks, Will

  1. You are using the wrong method. If you want the query to run every time you call it, you should use #load.

  2. Using #load you will know exactly when the query is done

  3. #sum with a block is not recommended, since it loads all the objects into memory. This is why it was deprecated.

The query method is there. It is called #load now.

I reviewed the code: on a relation, #load checks loaded?, so if the relation is already loaded it will not run the query again. The only way right now to reload a relation is #reload.
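To make the difference between #load and #reload concrete, here is a minimal plain-Ruby sketch. ToyRelation is a hypothetical stand-in written for this illustration, not ActiveRecord's implementation; it only mimics the loaded? check described above, and the block passed to it plays the role of the SQL query.

```ruby
# Toy stand-in for a relation. NOT ActiveRecord code; it only models
# the "skip the query if already loaded" behaviour discussed above.
class ToyRelation
  attr_reader :query_count

  def initialize(&fetch)
    @fetch = fetch      # block that plays the role of the SQL query
    @records = nil
    @loaded = false
    @query_count = 0
  end

  def loaded?
    @loaded
  end

  # Runs the query only if nothing has been loaded yet.
  def load
    unless loaded?
      @records = @fetch.call
      @query_count += 1
      @loaded = true
    end
    self
  end

  # Discards the loaded state and always runs the query again.
  def reload
    @loaded = false
    load
  end

  # Returns the loaded records, loading them first if needed.
  def to_a
    load
    @records.dup
  end
end

rows = [1, 2]
relation = ToyRelation.new { rows.dup }

relation.to_a                  # first access runs the "query"
rows << 3                      # someone else inserts a row
stale = relation.load.to_a     # already loaded, so no new query
fresh = relation.reload.to_a   # forced query picks up the new row
```

In this toy model, only the reload call sees the third row, which is the behaviour the thread is arguing about.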

p.tasks(true) will *always* reload the association. I haven't seen the equivalent for relations, though.

--Matt Jones

I think Rafael has already answered your questions, but as the person who made the changes I'm happy to answer any further questions if you have them.

I’m not arguing that sum should take a block, I’m showing that there are still lots of cases in which you need an array.

Yes it’s bad to sum a column on a lot of objects as they will get loaded into memory. But we need to run sum on methods that are not columns too, and we use this with scopes that cut the number of objects down to a number that are fine to process in memory.
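As a rough illustration of summing over a method that has no column behind it, here is a plain-Ruby sketch. The Task struct, its fields, and remaining_hours are all hypothetical, and the array stands in for what something like project.tasks.to_a would return once a scope has narrowed the set.

```ruby
# Hypothetical in-memory records; a Struct stands in for a model class.
Task = Struct.new(:id, :estimate_hours, :spent_hours) do
  # A computed value that exists only in Ruby, so the database
  # cannot SUM it for us.
  def remaining_hours
    estimate_hours - spent_hours
  end
end

# Stand-in for an already-loaded, already-scoped array of records.
tasks = [Task.new(1, 8, 3), Task.new(2, 5, 5)]

# Array#sum with a block works on the in-memory objects.
total_remaining = tasks.sum(&:remaining_hours)
id_total = tasks.sum(&:id)
```

The point is only that an array is required here; whether loading the records is acceptable depends on how small the scope makes the set.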

Hi Jon,

Unfortunately as per Matt and Rafael's second reply, there is no method that *always* runs a query, whether you have an association or a relation. We still need that.

Cheers, Will

#reload will always run the query.

If I'm misunderstanding the use case please provide some examples.

Hmm. But you can't run reload on a scope to get an array - it returns a relation, which as per previous emails doesn't behave the same.

So are you saying we should use .reload.to_a everywhere instead of #all?

That really seems like a worse API than #all, and this is a very common operation. Is it really worth changing #all to be nearly useless and having no direct way to do that?

Could we at least have a method that does this, say "query"?

Or all(true)

That’s better than nothing, but #all returns a relation now, which is half the problem (for example, when you need to sum methods rather than columns).

Most other “enterprise-y” languages (esp. Java+Hibernate, .NET+NHibernate, .NET+Entity Framework) reinforce deferred execution of queries: the query is not executed until it is enumerated. This is now the exact same behavior as those platforms.

Calling .sum() on an association, before you’ve enumerated, will alter the query to perform a SELECT SUM(…). Obviously, this fails on anything that doesn’t exist in the database. If you want to .sum() in memory, you must enumerate the collection first (calling .ToArray() in .NET will do this). I don’t know if the author of this deferred execution has experience with these other ORMs, but the behavior is now identical.

Employee.all vs. Employee.all.to_a is not that big of a deal. And you should expect breaking changes with a major version number. That’s why your code is covered by tests, right?

Signed,

An-ex-.NET-developer-turned-Rubyist

Hi Jarrett,

As per previous emails, the problem is that you can’t now force it to do a query using any particular method. Associations will cache if you enumerate them and so will not behave in that simple way.

Yes we have tests and yes it does show that this kind of change breaks things. That’s why I’m complaining.

Cheers,

Will

http://edgeguides.rubyonrails.org/association_basics.html#controlling-caching

It looks like clearing the cache and going to the DB is quite easy. I don’t believe your statement is accurate that you can’t force a fresh, enumerated query. And the cache is reset per request, which is a very short time. The only thing I can think of that would require this kind of functionality is a database trigger or other out-of-sequence activity.

The thing that’s got worse is having to write different code for associations vs. relations. Currently #all will behave exactly the same way on both, which is very useful because you can write code on a model class that works the same way whether it’s working on a global scope or just an association.

BTW, the per-request caching mechanism is query caching, which is different to the loadedness caching that associations do, which is the reason to_a is not a consistent replacement for #all, and why I want at least a method that always does a query (#query, for example).

The query caching mechanism has other effects but it always returns a fresh set of objects and is also only on when you expect it, so it is easier to control.

#reload doesn’t work for you?

The thing that's got worse is having to write different code for associations vs. relations. Currently #all will behave exactly the same way on both, which is very useful because you can write code on a model class that works the same way whether it's working on a global scope or just an association.

I don't understand what you mean. Can you give an example of the two different bits of code you have to write?

BTW, the per-request caching mechanism is query caching, which is different to the loadedness caching that associations do, which is the reason to_a is not a consistent replacement for #all, and why I want at least a method that always does a query (#query, for example).

Again, I don't understand. Please provide before/after code so I can respond to a specific use case.

You have to use reload.to_a to work with everything (enumerable methods, for example).

Can we have a method that does this and make it part of the stable API, please? That’s what #all has always done and it’s pretty basic functionality.
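For what it's worth, the requested method could be sketched as a thin wrapper around reload.to_a. Everything below is an assumption made for illustration: the FreshQuery module and the query method name follow the suggestion in this thread and are not part of Rails, and FakeScope is a toy object that exists only so the sketch runs without a database.

```ruby
# Hypothetical helper: `query` is the name proposed in this thread,
# not a Rails API. It works on anything responding to #reload and #to_a.
module FreshQuery
  # Always re-runs the underlying query and returns a plain array.
  def query
    reload.to_a
  end
end

# Toy object standing in for a relation or association (assumption).
class FakeScope
  include FreshQuery

  attr_reader :queries

  def initialize(source)
    @source = source   # plays the role of the database table
    @cache = nil
    @queries = 0
  end

  # Re-reads the source and counts it as one query.
  def reload
    @cache = @source.dup
    @queries += 1
    self
  end

  # Returns the cached rows, loading once if nothing is cached yet.
  def to_a
    reload if @cache.nil?
    @cache.dup
  end
end

source = [:a]
scope = FakeScope.new(source)

first = scope.query    # runs a query
source << :b           # data changes behind our back
second = scope.query   # runs another query and sees the change
cached = scope.to_a    # no query: returns what was last loaded
```

In a real application the module would have to be mixed into the relation and association classes, which this sketch does not attempt.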

You received this message because you are subscribed to the Google Groups “Ruby on Rails: Core” group.

To unsubscribe from this group and stop receiving emails from it, send an email to rubyonrails-core+unsubscribe@googlegroups.com.

To post to this group, send email to rubyonrails-core@googlegroups.com.

Visit this group at http://groups.google.com/group/rubyonrails-core?hl=en.

For more options, visit https://groups.google.com/groups/opt_out.
