Approximate matches in searches

Hi
I want to include a “Did you mean…” feature in my searches. So, if someone searches for “Tily and the Wall”, it will return “No matches”, but underneath, it’ll say “Did you mean ‘Tilly and the Wall’?”, which will link to the page about “Tilly and the Wall”. I’m not sure how to do this though. Is there a common practice for approximate searches in databases? A lot of sites do it, so it must be practical. Any ideas?

-Nathan

Check out the SOUNDEX function for your preferred database (most modern
databases support it, I believe). Here's the link to the MySQL page
that mentions it. Scroll down about halfway. You should just be able
to include it in the :conditions part of your find().

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
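
To make it easier to see what SOUNDEX will and won't catch, here is a minimal
pure-Ruby sketch of the common "American Soundex" rules. It is only an
illustration: the database's own SOUNDEX() may differ in detail (MySQL's
documentation notes that it uses a slightly different variant and can return
longer codes), and the Band model in the comment is a hypothetical name, not
something from this thread.

```ruby
# Letter-to-digit table for the common American Soundex variant.
SOUNDEX_CODES = {
  "B" => "1", "F" => "1", "P" => "1", "V" => "1",
  "C" => "2", "G" => "2", "J" => "2", "K" => "2",
  "Q" => "2", "S" => "2", "X" => "2", "Z" => "2",
  "D" => "3", "T" => "3",
  "L" => "4",
  "M" => "5", "N" => "5",
  "R" => "6"
}.freeze

# Returns a 4-character code: first letter plus up to three digits.
def soundex(word)
  letters = word.upcase.gsub(/[^A-Z]/, "")
  return "" if letters.empty?

  code = letters[0].dup
  previous = SOUNDEX_CODES[letters[0]]
  letters[1..-1].each_char do |char|
    digit = SOUNDEX_CODES[char]
    # Skip a letter whose code repeats the previous one (the second "l" in "Tilly").
    code << digit if digit && digit != previous
    # H and W do not separate equal codes; vowels do.
    previous = digit unless char == "H" || char == "W"
  end
  code.ljust(4, "0")[0, 4]
end

puts soundex("Tily")    # => T400
puts soundex("Tilly")   # => T400
puts soundex("hello")   # => H400
puts soundex("helljo")  # => H420

# In Rails, the comparison would be pushed into the database instead, e.g.
# (hypothetical Band model with a name column):
#   Band.find(:all, :conditions => ["SOUNDEX(name) = SOUNDEX(?)", query])
```

"Tily" and "Tilly" get the same code, so the misspelling is matched. An extra
consonant such as the "j" in "helljo" changes the code, so that kind of typo
slips through.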

njmacinnes@gmail.com wrote:

Ok, that's great, thanks. It'll do the job nicely. But if I make a
typo in Google, for example, "helljo", it'll ask me if I meant
"hello", which SOUNDEX won't catch. It doesn't really matter too much,
because SOUNDEX will catch a lot of the errors, but I can't help
thinking that if Google can do it, then why shouldn't I be able to?

-Nathan

Google has a huge amount of data on file about what people search for,
including what sequences of searches are made. They use this data to
construct the "did you mean" link.

For example, if a hundred people search for 'rubi on rhails' and then
search for 'ruby on rails' right after that, Google will learn that
perhaps 'ruby on rails' is more correct and ask the next person who
searches for 'rubi on rhails' if they meant 'ruby on rails'. This is a
surprisingly dumb (complexity-wise) algorithm. It works simply because
Google processes so many searches.
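
A rough Ruby sketch of that idea, working on your own saved searches rather
than Google's data. It is not Google's actual method, only the counting scheme
described above. The input format (one array of queries per visitor, in the
order they were made), the build_suggestions name, and the min_count threshold
are all assumptions made up for the illustration.

```ruby
# sessions: array of arrays, each holding one visitor's queries in order.
# Returns a hash mapping a query to the query that most often followed it,
# provided that follow-up was seen at least min_count times.
def build_suggestions(sessions, min_count = 2)
  counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }

  sessions.each do |queries|
    # Look at each pair of consecutive searches by the same visitor.
    queries.each_cons(2) do |first, second|
      counts[first][second] += 1 unless first == second
    end
  end

  suggestions = {}
  counts.each do |query, followers|
    best, count = followers.max_by { |_, n| n }
    suggestions[query] = best if count >= min_count
  end
  suggestions
end

sessions = [
  ["rubi on rhails", "ruby on rails"],
  ["rubi on rhails", "ruby on rails"],
  ["rubi on rhails", "ruby gems"],
  ["ruby on rails"]
]

p build_suggestions(sessions)
# => {"rubi on rhails"=>"ruby on rails"}
```

With only a handful of searches the counts are too small to trust, which is
why the threshold matters and why this only works with a lot of traffic.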

This is how it was explained to me by some colleagues; I can't say for
sure that this is what they do.

So in short, unless you are getting a lot of queries from different
people for the same things, you cannot do what Google does.

Matthew Margolis
blog.mattmargolis.net

njmacinnes@gmail.com wrote:

Ok, I see. That makes a lot of sense.

As a point of interest, and a somewhat pointless exercise, it would be
quite possible to use the information they've already compiled. Each
unsuccessful query would be searched for in Google, then any DYM
options given would be queried in the database to see if they bring up
results.

I'm planning on saving all searches in the database anyway. If I
decided later on that it would be possible and worth the effort to do
what Google does, this information could be compiled at a later date
from the saved searches.

-N