Scaling full-text indexing (ferret vs solr vs hyperestraier)


Does anyone have experience scaling full-text search in RoR?

Right now our project is running a simple setup with ferret and acts_as_ferret. We are thinking about deploying a feature that would send 50x more search requests.

So we probably have to rethink our solution. What do services like the former Summize use?

Or in what direction should I look?

With Ferret you can scale reads horizontally: you can have multiple read servers on a single index. You can only have one write server on a single index or you'll risk data corruption.
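
To make the one-writer, many-readers layout concrete, here is a minimal sketch of how an app could route requests. IndexRouter and the host names are hypothetical and are not part of Ferret or acts_as_ferret; the point is only that every write goes to one place while reads rotate across the read servers.

```ruby
# Hypothetical router: names are illustrative, not Ferret or acts_as_ferret API.
class IndexRouter
  attr_reader :writer  # the single server allowed to modify the index

  def initialize(writer:, readers:)
    raise ArgumentError, "need at least one reader" if readers.empty?
    @writer  = writer
    @readers = readers
    @next    = 0
    @lock    = Mutex.new
  end

  # Reads rotate across the read servers (round robin).
  def reader
    @lock.synchronize do
      host = @readers[@next % @readers.size]
      @next += 1
      host
    end
  end
end

router = IndexRouter.new(writer: "index-writer",
                         readers: ["search-1", "search-2", "search-3"])
router.reader  # first call picks "search-1", the next "search-2", and so on
router.writer  # always "index-writer"
```

Adding read capacity then means adding another entry to the readers list; the writer stays a single host.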

Another strategy is partitioning: having separate indices for buckets of data. Each index could run on its own server or cluster of servers.
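
A rough sketch of the partitioning idea, assuming documents are bucketed by a hash of their id. PartitionedIndex is a hypothetical class and the in-memory hashes only stand in for real per-bucket indices; none of this is Ferret API.

```ruby
require "zlib"

# Hypothetical sketch: each in-memory hash stands in for one index
# that could live on its own server.
class PartitionedIndex
  def initialize(shard_count)
    @shards = Array.new(shard_count) { {} }  # each hash maps doc id => text
  end

  # CRC32 gives the same bucket for a key in every process,
  # unlike String#hash, which is seeded per process.
  def shard_for(key)
    Zlib.crc32(key.to_s) % @shards.size
  end

  # A document is written to exactly one shard.
  def add(id, text)
    @shards[shard_for(id)][id] = text
  end

  # A query fans out to every shard and merges the matching ids.
  def search(term)
    needle = term.downcase
    @shards.flat_map do |docs|
      docs.select { |_id, text| text.downcase.include?(needle) }.keys
    end.sort
  end
end

index = PartitionedIndex.new(4)
index.add(1, "Scaling Ferret")
index.add(2, "Sphinx delta indexing")
index.add(3, "ferret read servers")
index.search("ferret")  # ids of matching documents, merged across shards
```

The trade-off is that each write touches only one small index, but every search has to query all buckets and merge the results.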

Would it be easier to scale with hyperestraier or something else?

One option that worked very well for me is ultrasphinx.

IIRC, 2 limitations of ultrasphinx are:

* new entries can only be found after reindexing (full reindexing or delta indexing)
* you need a separate sphinx process somewhere on a server (if you run a shared hosting system, this may be an issue)

If you can live with those 2 limitations, ultrasphinx is a very good candidate.