Memcached would be great for this. You could even simply store the
full synonym list under every possible word. That is of course very
inefficient from a storage point of view, but then again, all caching
is, by definition: it trades space for speed.
Marc CloudCache.net
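A minimal sketch of what "store the synonym list for every possible word" could look like. It assumes the thesaurus starts out as groups of mutually synonymous words, which the thread does not actually state; the function name `explode_thesaurus` and the sample words are hypothetical.

```python
from collections import defaultdict


def explode_thesaurus(groups):
    """Turn synonym groups into one word -> synonyms entry per word."""
    exploded = defaultdict(set)
    for group in groups:
        for word in group:
            # Every other member of the group is a synonym of this word.
            exploded[word].update(w for w in group if w != word)
    # Sorted lists keep the stored values stable and easy to serialize.
    return {word: sorted(syns) for word, syns in exploded.items()}


# Hypothetical sample data: "quick" appears in two groups.
groups = [["fast", "quick", "speedy"], ["quick", "brief"]]
table = explode_thesaurus(groups)
print(table["quick"])  # ['brief', 'fast', 'speedy']
```

Each word becomes its own key, so a lookup is a single fetch with no joins or traversal. The price is that every synonym is stored many times over.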
On Apr 19, 2008, at 2:45 PM, Mike Laurence <rails-mailing-list@andreas-s.net> wrote:
That would be pretty speedy. One issue - the thesaurus data is about 12 MB per language, so if many languages are available, that could be hundreds of MB of RAM tied up. Not terrible, but not ideal.
Do you see any issues with the Apache model I mentioned above? I don't have much experience with Apache, so I'm unsure whether there would be performance issues due to large numbers of folders/files in the paths.
Thanks!
Mike
Some sizing estimates:
Number of words in a good dictionary: 1M
Average length of word: 8 bytes
Average number of words in thesaurus for each word: 30
Size of memcached “exploded” thesaurus for each language: 256 MB
Cost of a 1.7 GB machine on EC2: $65/month.
Serving up thesaurus results fast enough for AJAX: priceless ;^)
Cheers,
Marc CloudCache.net
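The sizing estimate above can be checked with a few lines of arithmetic. This is a rough sketch under the figures quoted in the message; the function name `estimate_bytes` is hypothetical, and the calculation ignores memcached's per-item overhead, which presumably accounts for some of the rounding up to 256 MB.

```python
def estimate_bytes(words, avg_word_bytes, avg_synonyms):
    """Raw payload of an exploded thesaurus: one key plus its synonym list per word."""
    keys = words * avg_word_bytes                    # the words themselves
    values = words * avg_synonyms * avg_word_bytes   # 30 synonyms of ~8 bytes each
    return keys + values


# Figures taken from the estimate above.
per_language = estimate_bytes(words=1_000_000, avg_word_bytes=8, avg_synonyms=30)
print(per_language)  # 248000000 bytes, i.e. roughly 250 MB before overhead

# How many 256 MB languages fit on a 1.7 GB machine (leaving nothing for the OS).
MACHINE_BYTES = 1_700_000_000
BUDGET_PER_LANGUAGE = 256 * 10**6
languages_per_machine = MACHINE_BYTES // BUDGET_PER_LANGUAGE
print(languages_per_machine)  # 6
```

So one such machine could hold the exploded thesaurus for a handful of languages, which matches the spirit of the estimate even if the real figure depends on memcached's overhead and on how much memory the OS needs.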
Mike,
My opinion:
1- Cache is way faster than the file system
2- Once it is cached, it doesn't matter if it comes from the file system or the database
3- Managing your thesaurus in the file system could become a big mess
So, I would definitely go for DB + memcached.
Cheers, Sazima
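A minimal sketch of the "DB + memcached" approach as a cache-aside lookup. To stay self-contained it uses an in-memory stand-in (`FakeMemcached`, hypothetical) instead of a real client, and `db_lookup` is a hypothetical callable standing in for the database query. The key format and the comma-joined value encoding are assumptions too, and the encoding only works if synonyms never contain commas.

```python
class FakeMemcached:
    """In-memory stand-in with get/set, used here in place of a real memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None on a miss

    def set(self, key, value):
        self._data[key] = value


def cache_key(lang, word):
    # memcached keys may not contain whitespace, so replace spaces.
    return f"thesaurus:{lang}:{word.replace(' ', '_')}"


def lookup_synonyms(cache, db_lookup, lang, word):
    """Return synonyms from the cache, falling back to the database on a miss."""
    word = word.strip().lower()
    key = cache_key(lang, word)
    cached = cache.get(key)
    if cached is not None:
        # An empty string means "known to have no synonyms".
        return cached.split(",") if cached else []
    synonyms = db_lookup(lang, word)
    cache.set(key, ",".join(synonyms))
    return synonyms


# Hypothetical database: records each call so the caching is visible.
db_calls = []


def db_lookup(lang, word):
    db_calls.append((lang, word))
    return {"quick": ["fast", "speedy"]}.get(word, [])


cache = FakeMemcached()
first = lookup_synonyms(cache, db_lookup, "en", "Quick")   # miss: hits the database
second = lookup_synonyms(cache, db_lookup, "en", "quick")  # hit: served from the cache
print(first, second, len(db_calls))
```

Only the first lookup reaches the database; after that, where the data originally lived no longer matters, which is Sazima's second point. With a real client, keys would also need to stay within memcached's 250-byte key limit.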