ordering find() queries by non-database properties

Hi all,

I'm having a little problem with find() methods in a project I'm working on. Basically I'm working on a time management system that assigns each task a priority (an integer from 1 to 10) based on various factors (time until deadline, whether other tasks are dependent upon it), and I need to order a list of the tasks by their priority.

The 'priority' of a task isn't actually a database column - it's calculated by the following method in my Task model:

Tim,

It looks like the priority calculation just manipulates the time_left such that the incomplete task with the largest time_left has the highest priority. Why can't you just order by time_left?

Alternatively, you could add a priority column to your table and then use a callback like before_save to ensure the priorities are kept up to date. That's going to be vastly more efficient when you are displaying a list of tasks.

Alternatively #2, the one unknown in your priority method is highest time_left value for incomplete tasks. You could store just this value as a class variable making the priority calculation trivial. For this method you need a way to initialize the class variable the first time its accessed, then keep it up to date with a callback method, use sort by time_left for your SQL queries, and then call the slimmed down priority method at display time.

Aaron

Hey Aaron,

It looks like the priority calculation just manipulates the time_left such that the incomplete task with the largest time_left has the highest priority. Why can't you just order by time_left?

Hmmm... the trouble here is that time_left is also a method of the Task class, rather than a database column.

Alternatively, you could add a priority column to your table and then use a callback like before_save to ensure the priorities are kept up to date. That's going to be vastly more efficient when you are displaying a list of tasks.

This makes a lot of sense, actually - there's a fair few methods of the task class that dynamically calculate values based on database values in this way, so caching them in the database would probably be sensible. The trick in that case is writing them back to the database in such an order that they are kept up to date (if priority and time_left are both calculated at run-time, and both need writing back to the database, but priority is calculated based upon time_left, I need to make sure that time_left is calculated and written back to the database first, presumably). Also, would this not have a pretty significant efficiency cost (with a non-trivial number of tasks) when saving records, as the priority and time_left of all the database records would need to be updated en masse whenever one record was saved?

Thanks for your help!

Cheers,

Tim

Also, would this not have a pretty significant efficiency cost (with a non-trivial number of tasks) when saving records, as the priority and time_left of all the database records would need to be updated en masse whenever one record was saved?

The answer really depends on your application. The choice is between calculate on write, or calculate on read. Will you be reading more often than writing, or writing more often than reading? The priority method in your original post is going to be really inefficient if you are displaying a page with multiple records. The call to priority for every task will do a find of all incomplete tasks and search through them. With the mass update you can do a single find, update the records, and write them back out.

With either method I would look for ways to make the calculation less painful. For example, maybe you can save the highest time so you don't always have to search for it.

Aaron

Dude seriously look at moving to Postgresql and writing a view in SQL to deal with this, it'll be a ton faster than getting rails to do it. I love Ruby but Active Record ain't the most efficient beast in the world.

If there is one thing databases are really good at it's storing and retrieving data. I'd definatly be using a view for that, and simply then using a find_by_sql to lookup on that view. Much cleaning in your controller/model code and much faster to retrieve as well.

Cam