Avoiding duplicate queries for concurrent requests for an unpopulated cache key, i.e., Cache Stampede

Justin_Gordon · January 6, 2022, 9:07am

Does anybody know why Rails.cache.fetch does not support handling for cache stampedes?

Ruby on Rails supports a race_condition_ttl option useful for cache values that will expire. However, it doesn’t help with the case of the initial population of a cache key.

I’m working on an app with 20+ Heroku P-L dynos, thus running hundreds of concurrent Puma threads. After deployment, many cache keys change to reflect a new release.

If 50 threads request the same cache key simultaneously, all 50 threads will invoke the same complicated queries, putting excessive load on the database.

Has anybody seen any solutions in the Rails community for this behavior? Would this be a beneficial addition to ActiveSupport::Cache::Store#fetch?

The complicated part of the distributed lock can be handled using a library like https://github.com/dv/redis-semaphore/blob/master/lib/redis/semaphore.rb.

Maybe this is beyond the scope of default Rails? Or perhaps it’s an excellent addition?
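
To make the idea concrete, here is a minimal sketch of "compute once per key" using only the Ruby standard library. Everything in it is an assumption for illustration: SingleFlightCache is a hypothetical class, a plain Hash stands in for Rails.cache, and the per-key Mutex only coordinates threads inside one process. Across many dynos the Mutex would have to be replaced by a distributed lock such as the Redis semaphore linked above.

```ruby
# Hypothetical in-process illustration; not part of Rails or ActiveSupport.
class SingleFlightCache
  def initialize
    @store = {}        # stands in for the real cache store
    @locks = {}        # one Mutex per cache key
    @guard = Mutex.new # protects @store and @locks
  end

  # Returns the cached value, running the block at most once per key
  # even when many threads miss at the same time.
  def fetch(key)
    # Fast path: the key is already populated.
    @guard.synchronize { return @store[key] if @store.key?(key) }

    lock = @guard.synchronize { @locks[key] ||= Mutex.new }
    lock.synchronize do
      # Re-check: another thread may have filled the key while we waited.
      @guard.synchronize { return @store[key] if @store.key?(key) }

      value = yield # the expensive query runs here, once
      @guard.synchronize { @store[key] = value }
      value
    end
  end
end

# Usage: 50 threads ask for the same unpopulated key.
cache = SingleFlightCache.new
calls = 0
counter = Mutex.new
threads = Array.new(50) do
  Thread.new do
    cache.fetch("release-123/report") do
      counter.synchronize { calls += 1 }
      sleep 0.05 # simulate a slow query
      :report
    end
  end
end
results = threads.map(&:value)
```

The second check inside the lock is what prevents the stampede: threads that lost the race wake up, find the key populated, and return without running the block.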

References

https://github.com/rails/rails/blob/main/activesupport/lib/active_support/cache.rb#L290-L301

What is a cache stampede and how we solved it by writing our own gem

If you are not setting some sort of lock to re-calculate cache while using probabilistic early expiration, then you will end up with multiple processes hammering your cache and underlying systems computing and re-writing the same value.

KSH-code · January 7, 2022, 3:08am

What do you think about cache warming? If you expect many simultaneous requests for the same keys, the cache needs to be warmed up before those requests arrive.
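
A rough sketch of what such a warm-up step might look like, assuming the hot keys are known ahead of time. The names are hypothetical: warm_cache is not a Rails API, the Hash stands in for the real cache store, and the lambdas stand in for the expensive queries.

```ruby
# Hypothetical warm-up step, e.g. run once from a post-deploy task.
# Populates each release-scoped key before traffic reaches the new code.
def warm_cache(store, release, builders)
  builders.each do |name, builder|
    key = "#{release}/#{name}"
    # Skip keys that are already populated so re-running is cheap.
    store[key] = builder.call unless store.key?(key)
  end
  store
end

store = {} # stands in for the real cache store
builders = {
  "dashboard"   => -> { "expensive dashboard payload" },
  "top_sellers" => -> { [1, 2, 3] }
}
warm_cache(store, "v42", builders)
```

Because the warm-up runs in a single process, each query executes once. It only covers keys that can be listed in advance, so keys derived from request parameters would still need some form of locking.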