Approximate matches in searches

Hi, I want to include a “Did you mean…” feature in my searches. So, if someone searches for “Tily and the Wall”, it will return “No matches”, but underneath it will say “Did you mean ‘Tilly and the Wall’?”, which will link to the page about “Tilly and the Wall”. I'm not sure how to do this, though. Is there a common practice for approximate searches in databases? A lot of sites do it, so it must be practical. Any ideas?

-Nathan

Check out the SOUNDEX function for your preferred database (most modern databases support it, I believe). Here's the link to the MySQL page that mentions it; scroll down about halfway. You should be able to use it in the :conditions part of your find().

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
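
To illustrate why SOUNDEX helps with the "Tily" / "Tilly" case, here is a minimal pure-Ruby sketch of the common American Soundex rules. The `soundex` method and `SOUNDEX_CODES` constant are names made up for this example, and the database's built-in SOUNDEX may differ in details (code length, handling of H and W), so treat it as an approximation rather than what MySQL actually runs.

```ruby
# Letter groups that share a Soundex digit. Vowels, H, W and Y have no digit.
SOUNDEX_CODES = {
  "BFPV" => "1", "CGJKQSXZ" => "2", "DT" => "3",
  "L" => "4", "MN" => "5", "R" => "6"
}.flat_map { |letters, digit| letters.chars.map { |c| [c, digit] } }.to_h.freeze

# Simplified American Soundex: first letter plus up to three digits,
# padded with zeros. Non-letters (spaces, punctuation) are ignored.
def soundex(text)
  letters = text.upcase.gsub(/[^A-Z]/, "")
  return "" if letters.empty?

  code = letters[0].dup
  previous = SOUNDEX_CODES[letters[0]]

  letters[1..-1].each_char do |ch|
    digit = SOUNDEX_CODES[ch]
    if digit
      # Skip a consonant that repeats the digit just emitted.
      code << digit unless digit == previous
      previous = digit
    elsif ch != "H" && ch != "W"
      # A vowel separates consonants; H and W do not.
      previous = nil
    end
  end

  code[0, 4].ljust(4, "0")
end

puts soundex("Tily and the Wall") == soundex("Tilly and the Wall")  # prints true
```

Because the doubled "l" collapses to a single digit, both spellings produce the same code, which is why comparing SOUNDEX values in the query finds the intended title.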

njmacinnes@gmail.com wrote:

Ok, that's great, thanks. It'll do the job nicely. But if I make a typo in Google, for example "helljo", it'll ask me if I meant "hello", which SOUNDEX wouldn't catch. It doesn't matter too much, because SOUNDEX will catch a lot of the errors, but I can't help thinking that if Google can do it, why shouldn't I be able to?

-Nathan

Google has a huge amount of data on file about what people search for, including which searches follow one another. They use this data to construct the "Did you mean" link.

For example, if a hundred people search for 'rubi on rhails' and then search for 'ruby on rails' right after that, Google will learn that 'ruby on rails' is probably more correct and will ask the next person who searches for 'rubi on rhails' whether they meant 'ruby on rails'. This is a surprisingly dumb (complexity-wise) algorithm. It works simply because Google processes so many searches.

This is how it was explained to me by some colleagues; I can't say for sure that this is what they do.

So in short, unless you get a lot of queries from different people for the same things, you cannot do what Google does.
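
The follow-up-query idea described above can be sketched roughly as below. This is only an illustration of the counting approach as it was explained here, not Google's actual method; the `SearchLog` class, its method names, and the `min_count` threshold are all hypothetical, and a real application would keep the counts in a database table rather than in memory.

```ruby
# Hypothetical in-memory log of "failed search, then a search that worked".
class SearchLog
  def initialize
    # @counts[failed_query][follow_up_query] => number of times seen
    @counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  end

  # Record that a search for `failed` returned nothing and the same
  # user searched for `follow_up` right afterwards.
  def record(failed, follow_up)
    @counts[normalize(failed)][normalize(follow_up)] += 1
  end

  # Return the most frequent follow-up for `query`, or nil when it has
  # been seen fewer than `min_count` times (an assumed threshold).
  def suggest(query, min_count: 3)
    follow_ups = @counts.fetch(normalize(query), nil)
    return nil if follow_ups.nil?

    best, count = follow_ups.max_by { |_follow_up, seen| seen }
    count >= min_count ? best : nil
  end

  private

  # Treat queries that differ only in case or spacing as the same query.
  def normalize(query)
    query.strip.downcase.squeeze(" ")
  end
end

log = SearchLog.new
100.times { log.record("rubi on rhails", "ruby on rails") }
2.times { log.record("rubi on rhails", "rubies") }
puts log.suggest("rubi on rhails")  # prints ruby on rails
```

The threshold is what makes volume matter: with only a handful of searches per misspelling, `suggest` has nothing reliable to return.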

Matthew Margolis blog.mattmargolis.net

njmacinnes@gmail.com wrote:

Ok, I see. That makes a lot of sense.

As a point of interest, and a somewhat pointless exercise, it would be quite possible to use the information they've already compiled. Each unsuccessful query would be searched for in Google, and any "Did you mean" suggestions it returned would then be queried in the database to see whether they bring up results.

I'm planning on saving all searches in the database anyway. If I decided later on that it would be possible and worth the effort to do what Google does, this information could be compiled at a later date from the saved searches.

-N