Find(:all) - lazy/eager

Philip Dutton wrote:

The reason I ask is because if I load the Employees into a collection
inside the script/console, then the console echos the collection to the
screen and you can see all the objects and all the "has_many" objects.
At this point, the system loaded everything.

I'm not sure I understand what you are doing here. What is the "has_many" objects you are referring to? You have an employee model:

class Employee < ActiveRecord::Base
end

If also said that you are trying to search based on a related object (in your other email). So I assume you have something like:

class Employee < ActiveRecord::Base
   has_many :tasks, :dependent => :destroy
end
class Task < ActiveRecord::Base
   belongs_to :employee
end

Now you want to load all employees that have high priority tasks. So you do:

@lazy_employees = Employee.find :all, :include => 'tasks',
   :conditions => ['tasks.priority = ?', 'high']

Now your complaint is that the tasks that meet the conditions are loaded with the above code which you feel is a bit wasteful? Am I understanding your problem correctly?

If that is the case there are a few things you can do:
1. Make sure it really matters. Does that extra loading of data actually cause a performance problems? If not don't worry about it. The above code is simple to understand. If it is an actual performance problem then proceed.

2. You could switch from using :include to :joins. This allows you to join without loading the actual related objects. So:

@lazy_employees = Employee.find :all,
   :joins => %q{INNER JOIN tasks ON
     tasks.employee_id = employees.id AND
     tasks.priority = 'high'}

You are removing the convienance of what :include provides (automatic joining and automatic object loading of the related object) but you gain more control. Your join is now a inner join instead of an outer join which is much more efficient. Also you are not loading those related objects.

3. If you want to keep the loading of related objects (for example if you use those objects in your code and want to avoid N+1 query problem) you can also improve the first attempt using the :select option so only the fields you actually use are loaded. For example:

@lazy_employees = Employee.find :all, :include => 'tasks',
   :select => 'employees.name, task.name, tasks.due_on',
   :conditions => ['tasks.priority = ?', 'high']

This limits you to only being able to print the employee name, task name and due date but it saves on the amount of data that has to be loaded.

4. If you don't care about related objects and only want certain fields you can combine step 2 and 3 to produce:

@lazy_employees = Employee.find :all,
   :select => 'employees.name',
   :joins => %q{INNER JOIN tasks ON
     tasks.employee_id = employees.id AND
     tasks.priority = 'high'}

Now you are only loading the data you need, using an INNER JOIN to reduce the load in SQL and not loading the related objects that you don't plan to print anything out about. But obviously now it is much uglier. If you really need the performance try to push the code down to the model level so your controllers and view remain easy to work with. See http://weblog.jamisbuck.org/2006/10/18/skinny-controller-fat-model for more info on this process.

In the end everything resolves to find_by_sql. Depending on your needs you might be able to operate at a very high level like the first attempt. Or you may need to operate almost at the SQL-level like the last attempt. You always still have the option of operating completely at sql if you need absolute control. Determining at what level to operate at is determined by a balance of performance and code readability. I have found in practice I rarely use actual SQL code. Most my my queries look like:

Employee.find_by_last_name('Anderson')
Employee.find :all, :include => :departments,
   :conditions => ['departments.name = ?', 'IT']

But every app will have a few queries that get closer to SQL. I have developed around 10 Rails applications thus far and never had a need for find_by_sql although I don't rule it out.

ActiveRecord is a thin wrapper around the relational model. There are other ORMs that operate at a higher level although I find that they often don't give you the low-level access that you occasionally need. I think Rails reaches a nice compromise of being a thin wrapper with some high-level features (such as dynamic finders and :include), but also providing complete low-level access.

Eric

Philip Dutton wrote:

Wow, Eric. You nailed everything in place. That last paragraph you wrote is the gem of information I was looking for. I appreciate your very generous response. I have read the Agile Rails book. Sadly the book fails to re-iterate the point you made. Also, it is easy to forget (as I did) that everything eventually boils down to find_by_sql.

In addition to the article on skinny controller, fat model Jamis also has some excellent "under the hood" articles.

http://weblog.jamisbuck.org/under-the-hood

The articles about ActiveRecord::Base.find you might find to be useful.

Eric