Is it possible to serialize an ActiveRecord object?

Hello,

Let's say that I have a model class like `Car < ActiveRecord::Base`, and I want to extend it with an abstract class in between, a sort of `Car < CacheableActiveRecord::Base < ActiveRecord::Base`.

In `CacheableActiveRecord::Base` I'll override methods like `find_by`, `find_by!` and `find` with the intention of checking Redis/Memcached before calling `super` and letting Rails resolve the query against the underlying DB. I'm using Rails 4.2.7.1.

By extending `Car`, an `after_commit` hook will be registered for each instance of the `Car` class; that hook will clear the cached key for the model upon commit. This gives the effect of releasing the cache on save.

This is my implementation of the `find` method in `CacheableActiveRecord::Base`:

```ruby
def self.find(*ids)
  expects_array = ids.first.kind_of?(Array)
  return ids.first if expects_array && ids.first.empty?

  ids = ids.flatten.compact.uniq

  case ids.size
  when 0
    super # calling super will raise an exception at this point
  when 1
    id = ids.first
    result = get_from_cache(id)
    unless result
      result = super # I expect this to get the data from the db
      result = set_to_cache(id, result) if result
    end
    expects_array ? [result] : result
  else
    super # for now let's just skip caching for multiple ids
  end
rescue RangeError
  raise ActiveRecord::RecordNotFound, "Couldn't find #{name} with an out of range ID"
end
```

I have tests for this, and they even seem to work. For the implementation of `set_to_cache` and `get_from_cache` I have the following code:

```ruby
def self.get_from_cache(entity_id)
  return nil unless entity_id

  cached = Rc.get!(get_cache_key(entity_id))
  puts "cache hit" if cached
  cached ? Marshal.load(cached) : nil
end

def self.set_to_cache(entity_id, entity)
  return nil unless entity && entity_id

  dump = Marshal.dump(entity)
  Rc.set!(get_cache_key(entity_id), dump)
  Marshal.load(dump)
end
```
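Both helpers call a `get_cache_key` method that isn't shown above; a minimal sketch of what it might look like (the naming and the `v1` segment are my assumption, shown on a bare class for illustration):

```ruby
class Car
  # Hypothetical key helper: namespacing by model name avoids collisions
  # between models sharing one Redis/Memcached instance, and bumping "v1"
  # invalidates every entry at once after a schema or format change.
  def self.get_cache_key(entity_id)
    "#{name}.v1.#{entity_id}"
  end
end

Car.get_cache_key(42) # => "Car.v1.42"
```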

My doubts here are:

  • Is Marshal safe? Can one do this, taking into account that this cache will be shared among instances of the same Rails app running on different servers?
  • Is it better to serialize to JSON? Is it possible to serialize to JSON and then rebuild an object that behaves like a regular Active Record object? I mean one on which you can call `.where` queries on related objects and so on? I'm a newbie to Rails and Ruby, started coding with both two weeks ago; my background is mostly Java xD (finding Ruby great BY THE WAY hahaha)
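For what it's worth, Marshal does round-trip plain Ruby objects between processes as long as every server runs a compatible Ruby version and has the same class definitions loaded; a minimal illustration with a stand-in class (not an actual ActiveRecord model):

```ruby
# Stand-in for a model row; Marshal needs this class defined, with the same
# shape, in every process that loads the dump.
CarRow = Struct.new(:id, :make)

row      = CarRow.new(1, "Saab")
dump     = Marshal.dump(row)  # Ruby-specific binary format
restored = Marshal.load(dump) # only safe on trusted data

restored == row # => true
```

One caveat: `Marshal.load` must never be fed untrusted bytes, since it can instantiate arbitrary objects; that's fine for a private cache only your own app writes. A JSON round-trip, by contrast, preserves only attribute values, which other languages can read, but the Ruby side would have to rebuild the model object itself.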

Thank you

Daniel

It might be possible but I would strongly suggest rethinking this.

What is going wrong with your DB that its built-in caching is insufficient? Have you exhausted your performance-tuning options in the DB layer? Have you measured a significant performance problem in the first place? Do you really need to serialize the entire record or could you cache just what you need? Have you considered the difficulty of keeping two data stores in sync?

If you can confidently answer all of those questions and you still want to pursue this design, then I would begin again with a data-mapper pattern instead.
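To make that suggestion concrete, here is one minimal shape a data-mapper layer could take (all names hypothetical); the mapper, not the model, decides where a record comes from:

```ruby
class CarMapper
  def initialize(cache:, source:)
    @cache  = cache   # anything responding to #[] and #[]=
    @source = source  # anything responding to #find, e.g. an AR class
  end

  # Serve from the cache when possible; otherwise fall back to the
  # source and remember the result.
  def find(id)
    @cache[id] ||= @source.find(id)
  end
end

# Usage, with a plain Hash standing in for Redis and a stub for the DB:
fake_db = Class.new do
  def self.find(id)
    { id: id, make: "Saab" }
  end
end

mapper = CarMapper.new(cache: {}, source: fake_db)
mapper.find(1) # => { id: 1, make: "Saab" }
```

The model class never learns about the cache, so swapping the cache out (or turning it off in tests) is a constructor argument rather than a monkey-patch of `find`.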

Hi Adam, thanks for answering

About the questions:

  • What is going wrong with your DB that its built-in caching is insufficient?

  • Load: there are millions of queries a second right now, and the Active Model classes I want to update are read constantly but hardly ever written

  • Have you exhausted your performance-tuning options in the DB layer?

  • I'm not an expert on this, but my understanding is that performance tuning only gets us so far

  • Have you measured a significant performance problem in the first place?

  • This investigation is looking to the future; I'll need to serve 20 times more traffic

  • Do you really need to serialize the entire record or could you cache just what you need?

  • I’m actually writing services in another language, that could fetch this data from the same cache, reads are all concentrated in my Rails app

  • So I'm actually looking to serialize a “row” to Thrift/Avro/Proto, save it somewhere, and then read it from there (Redis/Memcached) instead of issuing queries for data that doesn't change

  • I’m after two goals

  • Improve performance of my rails monolith

  • Make this data available for other services

  • Have you considered the difficulty of keeping two data stores in sync?

  • I've done this in the past, the exact same idea but on a custom MVC without an ORM, and it worked perfectly and improved performance. Later on, though, the system reached a point where cache invalidation became problematic: it couldn't operate any longer without its cache warm enough.

Besides these questions, given that I do want to pursue a design like this:

  • Can I plug this data-mapper anywhere in rails? Can I override a few methods and then delegate the rest back to the DB for a particular model?

  • The idea of using Marshal came from browsing the `Rails.cache` code, where functions to serialize/deserialize can be defined, or they default to `Marshal.load` and `Marshal.dump`. The `Rails.cache` documentation doesn't say anything against doing something like:

```ruby
class CarCache
  def self.get_car_from_cache_or_db(id)
    Rails.cache.fetch("cars.#{id}") do
      Car.find(id)
    end
  end
end
```

And then changing my application's call sites from `Car.find` to `CarCache.get_car_from_cache_or_db`.

Right?

Most of the time, as Adam suggested, the database would do this for you already.

But if it does not work, my guess is that your table is just so large that it won't fit in memory.

If the situation allows, try adding more memory to the database instances. If the size of your data is not in the same ballpark as a server's memory, congratulations, your business is doing great. In that case, I would try sharding and using a KV store instead of a relational database.

I've never faced something at that scale; I guess it also depends on the application.

There's not much I can add here, but you can try searching for articles about Twitter: they had the same problem in the past and were open about it, so you should find their material easily.

Have you tested to be sure that Marshal.load is significantly faster than ActiveRecord instantiation from a query? Have you verified that the database queries are a significant portion of the time to handle the request?

Have you tried using Rails view caching? View caching also saves you the view rendering time, which is often the majority of the time spent handling a request. For example, here’s a typical page load from one of my apps: Completed 200 OK in 35ms (Views: 25.6ms | ActiveRecord: 4.3ms). At millions of requests per second, I’d gain almost nothing by reducing the ActiveRecord time from 4.3ms to 1ms, compared to using the various elements of view caching that are already well supported and documented.
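Fragment caching, for instance, caches the rendered HTML per record, keyed on the model's `cache_key` (which includes `updated_at`, so saving the record busts the entry). A sketch, where `make` is a hypothetical attribute:

```erb
<%# app/views/cars/show.html.erb - sketch of Rails fragment caching %>
<% cache @car do %>
  <h1><%= @car.make %></h1>
  <%# rendered once per (id, updated_at) pair %>
<% end %>
```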