How can I preload an association with conditions of a collection?

Trying to preload an association of a collection, and that association may or may not have conditions. For example!

class User
  has_many :documents
end

# has a string column document_type
class Document
  belongs_to :user
end

If I run

users = User.all.includes(:documents)
# Note, I cannot do User.all.includes(:documents)
#                       .where(documents: { document_type: pdf })
# this just changes the query, I want ALL users.
users.each do |user|
  puts user.documents.where(document_type: "PDF") # N+1
end

This leads to an N+1 query though.

I understand I can define a new association

class User
  has_many :pdf_documents, -> { where(document_type: "PDF") }
end
users = User.all.includes(:pdf_documents) # Not an N+1 issue

But, for my use case. This isn’t really tenable.

I also can access the preloader directly (apologies for formatting)

users = User.all
ActiveRecord::Associations::Preloader.new.preload(users, 
:documents, Document.where(document_type: "PDF"))
users.each { _1.documents.each { _1.document_type } } # Not an N+1

But, this also leads to problems, since 1. Using an internal object (subject to change?). 2. Framework doesn’t recognize the association as loaded. I have not looked into this problem, and could maybe patch that as well, but I don’t feel confident about stability.

I’m wondering if I’m missing something? If not, happy to change this thread to a feature request. Ideas I’ve had for an interface would be something like

# This doesn't work with nested includes.
User.all.includes(documents: -> { where(document_type: "PDF") })

# Okay
User.all
       .includes(documents: { florghs: :blorghs })
       .association_filters(documents: -> { where(document_type: "PDF") },
                            florghs: -> { where.not(bar: "BAZ") })
       #blorghs have no association filter

I haven’t dove into feasibility, just looking for help. Thanks!

Heya Ben – great to have you along with us on the Rails discussion board.

Have played around with your models and some sample data, and started writing up a bunch of stuff, but quickly realised it would make more sense as a screencast. So here’s a video that explores using .includes() to do eager loading with filtering in the way you describe:

The same steps and code used in the video:

1. Here’s a Postgres script to create and populate four tables with a small amount of sample data:

CREATE TABLE users (
  id BIGSERIAL PRIMARY KEY,
  first VARCHAR,
  last VARCHAR
);
INSERT INTO users (first, last) VALUES
  ('Bart', 'Simpson'),
  ('Lisa', 'Simpson');

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  name VARCHAR,
  document_type VARCHAR,
  esoteric_info VARCHAR,
  user_id BIGINT REFERENCES users(id)
);
INSERT INTO documents (name, document_type, esoteric_info, user_id) VALUES
  ('Spy plans', 'PDF', 'Deliver to Milhouse', 1),
  ('Gadget costs', 'XLSX', 'Budget is $20', 1);

-- Add an N:M association to blorghs
CREATE TABLE blorghs (
  id BIGSERIAL PRIMARY KEY,
  name VARCHAR,
  bar VARCHAR
);
INSERT INTO blorghs (name, bar) VALUES ('Taco', 'BAZ'), ('Pizza', 'BUZZ');

-- The associative table to go between documents and blorghs:
CREATE TABLE florghs (
  id BIGSERIAL PRIMARY KEY,
  document_id BIGINT REFERENCES documents(id),
  blorgh_id BIGINT REFERENCES blorghs(id)
);

2. Now with this data then in a Rails app with your models we can use .includes() with .references() and ActiveRecord will make a query that eager loads in a way that includes all users because it uses LEFT OUTER JOIN:

u = User.includes(:documents).references(:documents)

3. To filter on only PDF documents, it seems that you’d want just something like this, but note that even with the LEFT OUTER JOIN in place you get back only users who have PDF documents, so just one is returned:

u = User.includes(:documents).references(:documents)
        .where('documents.document_type' => 'PDF')

4. But if you filter on document_type of either PDF or NULL (and if all documents have a valid type set), then you do get everyone back – even folks with no documents, plus for documents it will only be the PDFs:

u = User.includes(:documents).references(:documents)
        .where('documents.document_type' => ['PDF', nil])

This kind of approach might give you all the tricks you need to do what you had originally asked for with pure ActiveRecord. But there is one more thing I’d like to demonstrate!

5. One current limitation of eager loading in Rails is that if you want to select only specific columns, that is only possible in the first “root” model, and not in other included models. Check out the SQL query that this builds:

u = User.includes(:documents).references(:documents)
        .select(:first, 'documents.document_type')

The first column on users is the only non-key column that is brought back, but on documents, all columns are returned, even though we had only asked for document_type.

6. To get only the columns you want, you can use The Brick gem, which adds some smarts to .includes(). With this in place then for any query which does eager loading, ActiveRecord looks for the special value :_brick_eager_load as being the first member in a .select(), and if that’s found then it does properly filter on columns of your choosing. Here is the same example as before to show just first names and the document type, and omit showing anything else, in this case the esoteric_info column:

u = User.includes(:documents).references(:documents)
        .select(:_brick_eager_load, :first, 'documents.document_type')

7. Another example that uses the blorghs table, note that documents can JOIN to it when that model has the has_many :blorghs through: :florghs association:

u = User.eager_load(documents: :blorghs)
        .select(:_brick_eager_load, :first, 'documents.document_type', 'blorghs.bar')

Personally I’d love to see this kind of column selection for eager loading thing become a built-in part of ActiveRecord. In one of my projects where lots of tables gets JOINed then these Brick enhancements speed up some of the tricky queries pretty well. May put together a PR for this.

All the best for your Rails endeavours,

-Lorin

2 Likes

Thanks for the warm welcome Lorin.

I was missing the references method (had no idea it existed!). Thanks for the thoughtful answer, really appreciate it :slight_smile:

Also, Brick gem is really cool. Will be diving into it’s internals later this week, thanks for pointing it out as well!

1 Like