Is it possible to serialize an ActiveRecord object?

Hello,

Let's say that I have a model class like `Car < ActiveRecord::Base`, and I want to extend it with an abstract class in between, a sort of `Car < CacheableActiveRecord::Base < ActiveRecord::Base`.

In `CacheableActiveRecord::Base` I'll override methods like `find_by`, `find_by!` and `find` with the intention of checking Redis/Memcached before calling `super` and letting Rails resolve the query against the underlying DB. I'm using Rails 4.2.7.1.

By extending `Car`, an `after_commit` hook will be registered for each instance of the `Car` class; that hook will clear the cached key for the model upon commit. This gives the effect of releasing the cache on save.

This is my implementation of the `find` method in `CacheableActiveRecord::Base`:

```ruby
def self.find(*ids)
  expects_array = ids.first.kind_of?(Array)
  return ids.first if expects_array && ids.first.empty?

  ids = ids.flatten.compact.uniq

  case ids.size
  when 0
    super # calling super will raise an exception at this point
  when 1
    id = ids.first
    result = get_from_cache(id)
    unless result
      result = super # I expect this to get the data from the db
      result = set_to_cache(id, result) if result
    end
    expects_array ? [result] : result
  else
    super # for now let's just skip caching for multiple ids
  end
rescue RangeError
  raise ActiveRecord::RecordNotFound, "Couldn't find #{name} with an out of range ID"
end
```

I have tests for this, and they even seem to work. For the implementation of `set_to_cache` and `get_from_cache` I have the following code:

```ruby
def self.get_from_cache(entity_id)
  return nil unless entity_id

  cached = Rc.get!(get_cache_key(entity_id))
  puts "cache hit" if cached
  cached ? Marshal.load(cached) : nil
end

def self.set_to_cache(entity_id, entity)
  return nil unless entity && entity_id

  dump = Marshal.dump(entity)
  Rc.set!(get_cache_key(entity_id), dump)
  Marshal.load(dump)
end
```
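Both helpers call a `get_cache_key` method that isn't shown above; a minimal sketch of what it might look like (the naming and the `v1` segment are my assumption, shown on a bare class for illustration):

```ruby
class Car
  # Hypothetical key helper: namespacing by model name avoids collisions
  # between models sharing one Redis/Memcached instance, and bumping "v1"
  # invalidates every entry at once after a schema or format change.
  def self.get_cache_key(entity_id)
    "#{name}.v1.#{entity_id}"
  end
end

Car.get_cache_key(42) # => "Car.v1.42"
```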

My doubts here are:

  • Is Marshal safe? Can one do this, taking into account that this cache will be shared among instances of the same Rails app running on different servers?
  • Is it better to serialize to JSON? Is it possible to serialize to JSON and then rebuild an object that behaves like a regular Active Record object? I mean one on which you can call `.where` queries on related objects and so on? I'm a newbie to Rails and Ruby, started coding with both two weeks ago; my background is mostly Java xD (finding Ruby great BY THE WAY hahaha)
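For what it's worth, Marshal does round-trip plain Ruby objects between processes as long as every server runs a compatible Ruby version and has the same class definitions loaded; a minimal illustration with a stand-in class (not an actual ActiveRecord model):

```ruby
# Stand-in for a model row; Marshal needs this class defined, with the same
# shape, in every process that loads the dump.
CarRow = Struct.new(:id, :make)

row      = CarRow.new(1, "Saab")
dump     = Marshal.dump(row)  # Ruby-specific binary format
restored = Marshal.load(dump) # only safe on trusted data

restored == row # => true
```

One caveat: `Marshal.load` must never be fed untrusted bytes, since it can instantiate arbitrary objects; that's fine for a private cache only your own app writes. A JSON round-trip, by contrast, preserves only attribute values, which other languages can read, but the Ruby side would have to rebuild the model object itself.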

Thank you

Daniel

It might be possible but I would strongly suggest rethinking this.

What is going wrong with your DB that its built-in caching is insufficient? Have you exhausted your performance-tuning options in the DB layer? Have you measured a significant performance problem in the first place? Do you really need to serialize the entire record or could you cache just what you need? Have you considered the difficulty of keeping two data stores in sync?

If you can confidently answer all of those questions and you still want to pursue this design, then I would begin again with a data-mapper pattern instead.
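To make that suggestion concrete, here is one minimal shape a data-mapper layer could take (all names hypothetical); the mapper, not the model, decides where a record comes from:

```ruby
class CarMapper
  def initialize(cache:, source:)
    @cache  = cache   # anything responding to #[] and #[]=
    @source = source  # anything responding to #find, e.g. an AR class
  end

  # Serve from the cache when possible; otherwise fall back to the
  # source and remember the result.
  def find(id)
    @cache[id] ||= @source.find(id)
  end
end

# Usage, with a plain Hash standing in for Redis and a stub for the DB:
fake_db = Class.new do
  def self.find(id)
    { id: id, make: "Saab" }
  end
end

mapper = CarMapper.new(cache: {}, source: fake_db)
mapper.find(1) # => { id: 1, make: "Saab" }
```

The model class never learns about the cache, so swapping the cache out (or turning it off in tests) is a constructor argument rather than a monkey-patch of `find`.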

Hi Adam, thanks for answering

About the questions:

  • What is going wrong with your DB that its built-in caching is insufficient?

  • Load: there are millions of queries a second right now, and the Active Model classes I want to update are read constantly but hardly ever written

  • Have you exhausted your performance-tuning options in the DB layer?

  • I'm not an expert on this, but my understanding is that performance tuning only gets us so far

  • Have you measured a significant performance problem in the first place?

  • This investigation is looking to the future; I'll need to serve 20 times more traffic

  • Do you really need to serialize the entire record or could you cache just what you need?

  • I’m actually writing services in another language, that could fetch this data from the same cache, reads are all concentrated in my Rails app

  • So I'm actually looking to serialize a “row” to Thrift/Avro/Proto, save it somewhere, and then read it from there (Redis/Memcached) instead of issuing queries for data that doesn't change

  • I’m after two goals

  • Improve performance of my rails monolith

  • Make this data available for other services

  • Have you considered the difficulty of keeping two data stores in sync?

  • I've done this in the past, the exact same idea but on a custom MVC without an ORM, and it worked perfectly and improved performance. Later on, though, the system reached a point where cache invalidation became problematic: it couldn't operate any longer without its cache warm enough.

Besides these questions, given that I do want to pursue a design like this:

  • Can I plug this data-mapper anywhere in rails? Can I override a few methods and then delegate the rest back to the DB for a particular model?

  • The idea of using Marshal came from browsing the `Rails.cache` code, where functions to serialize/deserialize can be defined, or they default to `Marshal.load` and `Marshal.dump`. The `Rails.cache` documentation doesn't say anything against doing something like:

```ruby
class CarCache
  def self.get_car_from_cache_or_db(id)
    Rails.cache.fetch("cars.#{id}") do
      Car.find(id)
    end
  end
end
```

And then changing my application's call sites from `Car.find` to `CarCache.get_car_from_cache_or_db`.

Right?

Most of the time, as Adam suggested, the database would do this for you already.

But if it does not work, my guess is that your table is just so large that it won't fit in memory.

If the situation allows, try adding more memory to the database instances. If the size of your data is not in the same ballpark as a server's memory, congratulations, your business is doing great. In that case, I would try sharding and using a KV store instead of a relational database.

I've never faced something at that scale; I guess it also depends on the application.

There's not much I can add here, but you can try searching for articles about Twitter: they had the same problem in the past and were open about it, so you should find their material easily.

Have you tested to be sure that Marshal.load is significantly faster than ActiveRecord instantiation from a query? Have you verified that the database queries are a significant portion of the time to handle the request?

Have you tried using Rails view caching? View caching also saves you the view rendering time, which is often the majority of the time spent handling a request. For example, here’s a typical page load from one of my apps: Completed 200 OK in 35ms (Views: 25.6ms | ActiveRecord: 4.3ms). At millions of requests per second, I’d gain almost nothing by reducing the ActiveRecord time from 4.3ms to 1ms, compared to using the various elements of view caching that are already well supported and documented.
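Fragment caching, for instance, caches the rendered HTML per record, keyed on the model's `cache_key` (which includes `updated_at`, so saving the record busts the entry). A sketch, where `make` is a hypothetical attribute:

```erb
<%# app/views/cars/show.html.erb - sketch of Rails fragment caching %>
<% cache @car do %>
  <h1><%= @car.make %></h1>
  <%# rendered once per (id, updated_at) pair %>
<% end %>
```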