Serialize AR scope?

Is there a way to serialise and deserialise ActiveRecord scope? Most likely not, so the next question will be “where should I look into AR guts to build the behaviour I need?”

There’s, of course, .to_sql, but how do I “deserialise” that SQL string back into useful data?..

I had a similar need, though specifically for sending a relation to an ActiveJob. I’m sharing it even if that is not your goal, as it might be a useful pointer in the right direction or something you can build on.

The key thing for me was limiting the scope of what can be serialized. There is a method called where_values_hash which returns the where conditions as a hash. The caveat is that it is a public method but marked nodoc, which makes it pseudo-private.

Since it just provides the where hash, it cannot serialize things like joins, includes or even where.not. SQL fragments such as where('name ILIKE ?', 'John%') won’t work either. But it does still work if you have multiple where conditions, such as in a method chain that is built up: Widget.where(foo: 'bar').where(baz: 'boo').

Most of the time this is fine and even if not I can usually work around the limitation in some way. Here is the ActiveJob serializer I wrote:

class ActiveJob::RelationSerializer < ActiveJob::Serializers::ObjectSerializer
  def serialize? argument
    super && where_hash_only?(argument)
  end

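  def serialize scope
    # String keys and the class name survive the JSON round trip a queue adapter performs
    super "klass" => scope.klass.name, "conditions" => scope.where_values_hash
  end

  def deserialize serialized
    serialized["klass"].constantize.where serialized["conditions"]
  end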

  private

  def where_hash_only? scope
    scope.to_sql == scope.klass.where(scope.where_values_hash).to_sql
  end

  def klass = ActiveRecord::Relation
end

Note the check on serialization to see if the scope violates any of my limitations. I take the scope being serialized and try to re-create it from the where_values_hash. If the SQL strings don’t match, someone tried to serialize a relation that was more complex than this allows.


If all this is too limiting then my next strategy is just to pluck the ids. Example:

ids = Widget.some_complex_scope.pluck(:id)

Then later when I need the records in that scope I can:

Widget.where id: ids

The caveat here is that the records in that set may change between the time it was serialized and the time it was deserialized. Also, anything like a pre-load (includes, etc.) won’t happen since those are secondary queries.
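To make the id-list strategy concrete, here is a minimal sketch that runs without Rails. The id range stands in for the result of pluck(:id), and the helper name id_batches and the batch size of 1,000 are assumptions made for illustration. It shows that a plain array of integers survives a JSON round trip and can be cut into batches, for instance one per child job.

```ruby
require "json"

# Hypothetical helper: split a list of ids into fixed-size batches.
def id_batches(ids, batch_size: 1_000)
  ids.each_slice(batch_size).to_a
end

# Stand-in for Widget.some_complex_scope.pluck(:id).
ids = (1..2_500).to_a

# A plain array of integers comes back unchanged from JSON.
restored = JSON.parse(JSON.generate(ids))

batches = id_batches(restored)
puts batches.map(&:size).inspect # prints [1000, 1000, 500]
```

In a Rails app each batch would then be loaded with Widget.where(id: batch), which gives back a relation that can still be chained.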

There is also find_by_sql, which is fully public and has been around since the early days of Rails, so it is pretty solid. It wouldn’t be nearly so limited: joins, order, where.not and SQL fragments should all work (pre-loads still would not).

The main reason I didn’t do that to “deserialize” the SQL string is that you don’t get a relation but an array, meaning you can’t chain from it after deserializing. For example, the relation my jobs were processing might contain thousands of records. There was a main job that just paginated and spun off child jobs for each batch. With find_by_sql I couldn’t add on that pagination.

But if you are ok with not getting back a relation but instead an array that might be a solution for you.

Thanks! I like your idea. :slight_smile:

My approach so far is pretty basic:

scope = SomeModel.active.last_month.with_tasks.where(foo: "bar", baz: "baka", tasks: {active: true})

# there
scope_as_string = Base64.urlsafe_encode64(Zlib::Deflate.deflate(Marshal.dump(scope)))

# and back again
skope = Marshal.restore(Zlib::Inflate.inflate(Base64.urlsafe_decode64(scope_as_string)))

skope == scope #=> true

Which results in a url-safe string about 1kb in size (zlib added because it’s getting close to 2kb without compressing, and 2kb is the “safe limit” for url strings). It ain’t pretty, but it does the job.
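One thing to watch with this approach: a string that travels through a URL can be altered by the client, and Ruby’s documentation warns against calling Marshal.load on untrusted data. Below is a minimal sketch, assuming a shared secret, that appends an HMAC to the encoded payload and checks it before unmarshalling. The SECRET constant and the pack_scope / unpack_scope helper names are hypothetical, and a plain hash stands in for the relation so the snippet runs without Rails. In a Rails app, ActiveSupport::MessageVerifier covers similar ground.

```ruby
require "base64"
require "openssl"
require "zlib"

# Hypothetical secret; a real app would load this from its credentials.
SECRET = "change-me"

# Marshal -> deflate -> url-safe Base64, then append an HMAC of the payload.
# "." is used as the separator because it occurs in neither part.
def pack_scope(object, secret: SECRET)
  payload = Base64.urlsafe_encode64(Zlib::Deflate.deflate(Marshal.dump(object)))
  signature = OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
  "#{payload}.#{signature}"
end

# Verify the HMAC first; only unmarshal when the payload is untouched.
def unpack_scope(token, secret: SECRET)
  payload, _, signature = token.rpartition(".")
  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
  unless OpenSSL.secure_compare(expected, signature)
    raise ArgumentError, "signature mismatch"
  end
  Marshal.load(Zlib::Inflate.inflate(Base64.urlsafe_decode64(payload)))
end

# Stand-in for a relation.
original = { "klass" => "SomeModel", "conditions" => { "foo" => "bar" } }
token = pack_scope(original)
puts unpack_scope(token) == original # prints true
```

The signature adds 65 characters to the string, so it eats into the 2kb budget a little.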

I was thinking about something more compact, but it seems there’s no easy way to get there other than implementing a full-fledged (de)serialiser yourself. Or, making a factory class that’d take a hash as an argument and convert it to a relation, buuut… that ain’t handy at all :confused: It just doesn’t feel rails’y. :smiley:

Stumbled on this because at my company we also found ourselves wanting to serialize and deserialize AR scopes. Marshal does seem to work (haven’t tested it extensively but so far so good).

Introducing ActiveRecord::Relation#serialize and ActiveRecord::Relation.deserialize might be a cool PR to offer for Rails (and maybe just fine as a standalone gem if the Rails core team doesn’t want it). Aside from where_values_hash, what else would we need to capture everything? joins, includes, eager_loads, selects, aggregations … what else?

You kinda mentioned group_values when you said “aggregations” … so above and beyond all of that I think just distinct_value and order_values. Oh – and when it comes to JOINs, note that there is a separate left_outer_joins_values above and beyond just joins_values.

I believe at this point the easiest approach would be to parse the SQL back into a relation, tbh. The code generating SQL is so convoluted it’s crazy. I guess it just never came to anyone’s mind that a need might arise to do things in reverse, or to produce anything other than SQL :man_shrugging:

parse the SQL back into a relation

To me this seems fraught with pitfalls – your solution would need to understand JOINs and correlation names and really be pretty smart. Perhaps you could leverage the cryodex/sql-parser, and still that only gets you to its idea of a standardised tree of things.

Seems to me that the simpler thing would be to pursue what we’ve described so far – serializing the various arrays and hashes which compose an AR object.
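As a rough illustration of that direction, here is a minimal sketch that stores a scope as a JSON list of [method, argument] steps and replays only whitelisted query methods onto a starting object. Everything named here (rebuild_scope, ALLOWED_STEPS, ASSOCIATION_STEPS, FakeRelation) is hypothetical. FakeRelation is a stand-in that merely records calls so the snippet runs without Rails; in an app the starting object would be the model class. It only handles flat association names, not nested joins.

```ruby
require "json"

# Steps whose argument is a list of association names.
ASSOCIATION_STEPS = %w[joins left_outer_joins includes].freeze

# Query methods we are willing to replay; anything else is rejected.
ALLOWED_STEPS = (ASSOCIATION_STEPS + %w[where order group distinct]).freeze

# Rebuilds a query by replaying [method, argument] pairs onto `base`.
def rebuild_scope(base, steps)
  steps.reduce(base) do |scope, (name, arg)|
    raise ArgumentError, "step not allowed: #{name}" unless ALLOWED_STEPS.include?(name)

    if ASSOCIATION_STEPS.include?(name)
      # JSON turns symbols into strings, and a string passed to joins is
      # treated as raw SQL, so convert association names back to symbols.
      scope.public_send(name, *Array(arg).map(&:to_sym))
    elsif arg.nil?
      scope.public_send(name)
    else
      scope.public_send(name, arg)
    end
  end
end

# Stand-in for a model class: each query method records the call and
# returns a new recorder, mimicking how relations chain.
class FakeRelation
  attr_reader :calls

  def initialize(calls = [])
    @calls = calls
  end

  ALLOWED_STEPS.each do |name|
    define_method(name) { |*args| FakeRelation.new(calls + [[name, *args]]) }
  end
end

steps = [["where", { "foo" => "bar" }], ["joins", ["tasks"]], ["distinct", nil]]
wire = JSON.generate(steps) # the serialized form
rebuilt = rebuild_scope(FakeRelation.new, JSON.parse(wire))
puts rebuilt.calls.inspect
```

The whitelist matters because the step names come from serialized input; without it, public_send would happily call delete_all.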

One approach is to serialize the conditions of the scope into a format such as JSON or YAML, and then deserialize them back into Ruby objects to rebuild the scope. Here’s a simplified example of how you might do this:

class MyModel < ApplicationRecord
  scope :active, -> { where(active: true) }
end

# Serialize the scope conditions
serialized_scope = MyModel.active.where_values_hash.to_json

# Deserialize the scope conditions
conditions_hash = JSON.parse(serialized_scope)
reconstructed_scope = MyModel.where(conditions_hash)

# Now you can use the reconstructed scope as needed

This approach works for basic cases but may not handle more complex scopes that involve joins, includes, or other advanced ActiveRecord features.