One or two indexes? add_index :votes, [:item_id, :ip_address]

What's the difference between

add_index :votes, [:item_id, :ip_address]


add_index :votes, :item_id
add_index :votes, :ip_address

Not just in rails, but in the db as well?


For the first you’re adding the index on the item_id AND ip_address fields, together.

For the second you’re adding them on the fields individually.

So, like, when would I use one or the other? What's the benefit of
having an index on two columns, for example?

Sorry, I guess this is more an SQL question than a rails question.

add_index :votes, [:item_id, :ip_address]

Will create a single index on both columns, item_id first.

This will create an index with higher selectivity and is a better index
ONLY IF you always use both columns in a select query. With most
database engines you the index will also effectively work if you just
use item_id on its own in a select, but NOT using just ip_address.

add_index :votes, :item_id
add_index :votes, :ip_address

Will create two separate indexes and both may be used on their own in a
select query. Depending on the database engine it may be possible that
both are used in a select that uses both in the where clause. Combining
two separate indexes for a single query is highly dependent on the
database engine you are using, so check the documentation for your
database. Postgresql and MSSQL server can do it, not sure about MySql.


Stedwick wrote:

I can only tell you about MySQL, but as far as it goes there's a big difference.
MySQL will only ever use one index per table, so if you do

with 2 separate indices then mysql will pick the one it things is the best (most selective) and ignore the other one. With a combined index it will find things faster.
Note that if you have an index on [a,b,c,d] then that index can be used as an index for any left prefix (ie it can be used as an index on a, [a,b] but not on b alone.
It will also help you with things like Votes.find :all, :conditions => ["item_id = 123 AND created_at >= ? and created_at < ?"] (if you had a combined index on item_id and created_at). Similarly it will help mysql if you did a select of all Votes with a give item_id and then sorted by ip_address.

So a combined index can really help things out. The problem is that since order matters, there are a lot of possible combined indices: if you had 5 columns you were interested in, then you could have 5! (120) different combined indices which would probably be a bad idea, so you have to pick the ones that matter.


Excellent description. I got it now. Thx!

Another use of a multi-column index is the idea of a "covering" index,
which contains all the columns in a particular query. Only one or two
of the index columns are actually used for querying - the other
columns come along for the ride so that, having found the right index
entry, there's no need to go back to the table.