Someone responded to me off list about this issue, so I thought I would update this thread with what I ended up having to do.
You can't use a solution like memcached to store this type of data, since the objects are not serializable.
The best solution that I could come up with was an in-memory cache. The problem with this, of course, is that Rails apps run in separate Ruby processes, so implementing an in-memory cache immediately implies that each process has its own cache (this is why Rails is into "shared-nothing"). If you want the state of these stored objects to be represented consistently in the app, you have to figure out how to manage caches that may or may not be present in every running Ruby process.
If each Ruby process has its own cache, there is a real chance that you hit process A on one request and establish a cache entry, then hit process B on the next request and have to regenerate the same entry, and so on. So this approach only makes sense when the number of requests that could use a cached entry is greater than the number of Ruby processes on your back end. Then, worst case, you do your expensive cache-entry generation N times (where N is the number of Ruby processes) to service X requests (where X > N). Best case, the same process gets hit on every request and you reap the maximum benefit from one in-process cache (1 cache-entry generation for X requests).
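To make the N-versus-X argument concrete, here is a rough simulation. It is only a sketch: it assumes requests land on processes uniformly at random, and count_regenerations is a name I made up for illustration.

```ruby
# Hypothetical simulation: every request lands on a random process, and a
# process regenerates the entry only the first time it is asked for it.
def count_regenerations(process_count, request_count, rng = Random.new(42))
  warmed = Array.new(process_count, false) # one flag per process-local cache
  regenerations = 0
  request_count.times do
    pid = rng.rand(process_count)
    next if warmed[pid]            # this process already holds the entry
    warmed[pid] = true
    regenerations += 1
  end
  regenerations
end

n = 4
x = 100
puts "#{count_regenerations(n, x)} regenerations for #{x} requests across #{n} processes"
```

However the requests are distributed, the count can never exceed the number of processes, which is the worst case described above.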
Here's what I ended up with:
I created a custom Cache object, shown below:
class WesCache::Cache < Hash
  REFRESH_TIME_KEY_PART = "_last_refresh_time"

  def needs_refreshing?(key, time_to_refresh)
    self.refresh_time(key) < time_to_refresh
  end

  def refresh_time(key)
    self.set_refresh_time(key, Time.now) unless self.has_key?("#{key}#{REFRESH_TIME_KEY_PART}")
    self["#{key}#{REFRESH_TIME_KEY_PART}"]
  end

  def set_refresh_time(key, refresh_time)
    self["#{key}#{REFRESH_TIME_KEY_PART}"] = refresh_time
  end

  def delete(key)
    super("#{key}#{REFRESH_TIME_KEY_PART}")
    super
  end
end
The keys embed the session id, so each in-process cache may be holding data related to any number of sessions. The cache also holds, for each key (and so implicitly on a per-session basis), an entry that records the last time the local cached value was refreshed. In my case "refreshed" means deleted: my objects are read-only, so they either exist or they don't, and "refreshing" means removing the entry rather than updating it.
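As a rough illustration of that key layout, using a plain Hash instead of the real cache: the session_key and drop_entry helpers and the "session_id:name" format are assumptions of mine, not what the app actually uses.

```ruby
REFRESH_SUFFIX = "_last_refresh_time" # same suffix as REFRESH_TIME_KEY_PART

# Hypothetical key format: session id first, then the name of the cached item.
def session_key(session_id, name)
  "#{session_id}:#{name}"
end

# "Refreshing" an entry means removing it together with its timestamp entry.
def drop_entry(cache, key)
  cache.delete("#{key}#{REFRESH_SUFFIX}")
  cache.delete(key)
end

cache = {}
%w[abc123 def456].each do |session_id|
  key = session_key(session_id, "list_data")
  cache[key] = ["data for #{session_id}"]
  cache["#{key}#{REFRESH_SUFFIX}"] = Time.now
end

drop_entry(cache, session_key("abc123", "list_data"))
puts cache.keys.inspect # only the entries for the other session remain
```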
Then there is a concept of "global last refresh time" which is managed _globally_ for the application. The unified value of the last refresh time for a given _cache and key within it_ is stored in a memcache. So, to summarize, each Ruby process has its own "smart hash" cache that can keep track of when the last refresh was done for _a given key_ for itself. Then, it can compare the local refresh time against the global refresh time and know to remove its local entry (and thus cause it to be regenerated by the caller).
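Here is a stripped-down sketch of that local-versus-global comparison, with two simulated processes. It is not the real code: a plain Hash stands in for the memcache, TickClock stands in for Time.now so the result doesn't depend on timer resolution, and LocalCache is a simplified stand-in for the cache object above.

```ruby
# Deterministic stand-in for Time.now (an assumption for this example):
# every call returns a larger tick.
class TickClock
  def initialize
    @tick = 0
  end

  def now
    @tick += 1
  end
end

# Simplified per-process cache. `global_times` stands in for the hash of
# refresh times kept in memcache.
class LocalCache
  def initialize(global_times, clock)
    @global_times = global_times
    @clock = clock
    @values = {}
    @refreshed_at = {}
  end

  # Return the cached value, first dropping it if another process has
  # invalidated the key since this process last refreshed it.
  def fetch(key)
    local = (@refreshed_at[key] ||= @clock.now)
    global = @global_times[key]
    if global && local < global
      @values.delete(key)
      @refreshed_at[key] = @clock.now
    end
    @values[key] = yield unless @values.key?(key)
    @values[key]
  end

  # Publish a new global refresh time and drop the local copy.
  def invalidate(key)
    @global_times[key] = @clock.now
    @values.delete(key)
    @refreshed_at.delete(key)
  end
end

shared = {}
clock = TickClock.new
process_a = LocalCache.new(shared, clock)
process_b = LocalCache.new(shared, clock)

process_a.fetch("abc123:parse_tree") { "tree built by A" }
process_b.fetch("abc123:parse_tree") { "tree built by B" }
process_a.invalidate("abc123:parse_tree")
puts process_b.fetch("abc123:parse_tree") { "tree rebuilt by B" }
```

Process B never hears about the invalidation directly. It only notices, on its next fetch, that the global refresh time is newer than its own, and regenerates the entry.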
The object that makes use of all of this is the CacheManager - in retrospect, I might have moved all of this logic into the Cache object itself. I might refactor this in the future.
Here's the CacheManager:
require 'wes_cache/cache'

# The refresh times are stored in the _memcache_ cache, which is referred to through "Cache".
# DO NOT CONFUSE our caches (which are WesCaches) with the memcache.
# The parse_tree_cache and the list_data_cache are WesCaches (local to the Ruby process).
# The parse_tree_refresh_times and the list_data_refresh_times are memcaches ("global" to all Ruby processes).
class CacheManager
  @@logger = RAILS_DEFAULT_LOGGER

  # Local process-based caches (effectively hashes)
  @@parse_tree_cache = WesCache::Cache.new
  @@list_data_cache = WesCache::Cache.new

  # "Global" memcaches (for access by any process)
  Cache.put("parse_tree_refresh_times", Hash.new) if Cache.get("parse_tree_refresh_times").nil?
  Cache.put("list_data_refresh_times", Hash.new) if Cache.get("list_data_refresh_times").nil?

  def self.parse_tree_cache(key)
    get_cache("parse_tree_refresh_times", @@parse_tree_cache, key)
  end

  def self.remove_from_parse_tree_cache(key)
    remove_from_cache("parse_tree_refresh_times", @@parse_tree_cache, key)
  end

  def self.list_data_cache(key)
    get_cache("list_data_refresh_times", @@list_data_cache, key)
  end

  def self.remove_from_list_data_cache(key)
    remove_from_cache("list_data_refresh_times", @@list_data_cache, key)
  end

  # Record a new global refresh time for the key, then drop the local entry.
  def self.remove_from_cache(refresh_times_cache_name, cache, key)
    refresh_times_cache = Cache.get(refresh_times_cache_name)
    refresh_times_cache[key] = Time.now
    Cache.put(refresh_times_cache_name, refresh_times_cache)
    cache.delete(key)
  end

  # Return the local cache, after dropping the entry for key if some process
  # has recorded a global refresh time newer than our local one.
  def self.get_cache(refresh_times_cache_name, cache, key)
    refresh_times_cache = Cache.get(refresh_times_cache_name)
    # If no global refresh time is recorded, nothing has been invalidated yet.
    # Fall back to the epoch, which is never newer than the local refresh time.
    last_global_refresh_time = refresh_times_cache[key] || Time.at(0)
    @@logger.debug("\tLast global refresh time is: #{last_global_refresh_time}")
    @@logger.debug("\tLast time this cache was refreshed: #{cache.refresh_time(key)}")
    if cache.needs_refreshing?(key, last_global_refresh_time)
      @@logger.info("Need to refresh local cache for key #{key}")
      # delete also clears the key's refresh time, so record the new one afterwards.
      cache.delete(key)
      cache.set_refresh_time(key, Time.now)
    else
      @@logger.info("Don't need to refresh local cache for key #{key}")
    end

    cache
  end

  # A bare "private" has no effect on methods defined with "def self.",
  # so hide the helpers explicitly.
  private_class_method :remove_from_cache, :get_cache
end
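Since the lookup methods hand back the local cache itself rather than a value, the caller still has to check for the entry and generate it when it is missing. Something along these lines would do it. This is a sketch: fetch_or_generate is a hypothetical helper that is not part of the code above, and a plain Hash stands in for what CacheManager.list_data_cache(key) returns.

```ruby
# Hypothetical helper: return the entry for key from the given cache,
# generating and storing it only when it is missing.
def fetch_or_generate(cache, key)
  return cache[key] if cache.has_key?(key)
  cache[key] = yield
end

# A plain Hash stands in for the result of CacheManager.list_data_cache(key).
local_cache = {}
generations = 0

2.times do
  fetch_or_generate(local_cache, "abc123:list_data") do
    generations += 1 # the expensive generation would happen here
    %w[alpha beta]
  end
end

puts "generated #{generations} time(s)"
```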
I realize that all of this may be confusing. If anyone finds it useful, I'm happy to answer any questions.
Wes