to_json performance

Originally posted on github, reported to the right place.

I would like to open a discussion about how `to_json` and `as_json`
operate in Rails from a performance standpoint.

I'm using Rails 3.2 but this issue applies to almost all versions of
Rails.

Our use case presents the challenge of sending out potentially large
JSON (or XML, but we'll focus on JSON rendering here) bodies as the
response to API calls that run through a normal Rails controller.

Someone states that if you are sending out more than 1MB of data
"you're doing it wrong".
I respectfully disagree: there are different cases, depending also on
what an application should do and is designed for.

There can be a few major bottlenecks in rendering JSON as fast as
possible from start to finish; three of them are:

* database query
* AR objects instantiation
* collection transformation in JSON

I'll exclude the database query timings, so let's tackle the next
problem: AR object instantiation.

Having a result set of 50k entries is obviously painful, and doing it
with AR will kill your instance, no questions asked.
Worse, in my use case this result set can be requested frequently by
the client.
This however can be easily solved with a judicious mixture of
`caches_action` and `stale?`, although there may be a few key issues
when using memcached and friends, because especially in managed
implementations, they may not support cached values larger than 1MB.
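
For reference, a minimal sketch of that mixture (controller and model
names here are hypothetical, and the cache options would need tuning
for your setup):

```ruby
class PlaylistsController < ApplicationController
  # Cache the rendered body so repeated hits skip rendering entirely.
  caches_action :media_files,
                cache_path: proc { |c| "playlists/#{c.params[:id]}/media_files" }

  def media_files
    @playlist = Playlist.find(params[:id])
    # stale? sets ETag/Last-Modified and answers 304 Not Modified on a
    # match, skipping the expensive rendering below.
    if stale?(etag: @playlist, last_modified: @playlist.updated_at)
      render json: @playlist.media_files
    end
  end
end
```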

Since instantiating AR objects is out of the question, let's focus on
getting what we really want for our tests: an 'attributes' Hash to
convert to JSON.
Valium helps a bit here (it should be in AR core in my opinion): it
can selectively pick the columns you need and it returns the values in
an Array for every object, so we can have something like
this: `[['a', 'b', 'c'], ['a', 'c', 'd']]`

With little sorcery we can transform each Array representing the
requested record values in a Hash with keys mapped, like so:

    class Hash
      # Pair each key with the value at the same index and build a Hash.
      def self.transpose(keys, values)
        self[*keys.zip(values).flatten]
      end
    end

    keys = %w[token artist album genre release_year title duration
              duration_in_seconds position play_count favorited disc_number]
    my_hash = @playlist.media_files.ordered.values_of(*keys).collect { |v| Hash.transpose(keys, v) }

Nice, now we have an Array of Hashes, without instantiating AR objects
but with all the info we need about our objects.
This operation proved to be _extremely_ fast even when transforming
50k records.
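
As a standalone check, the transpose trick can be exercised with
made-up rows (the data below is illustrative):

```ruby
class Hash
  # Pair each key with the value at the same index and build a Hash.
  def self.transpose(keys, values)
    self[*keys.zip(values).flatten]
  end
end

keys = %w[token artist title]
rows = [['abc', 'Aphex Twin', 'Xtal'],
        ['def', 'Autechre', 'Eutow']]

# Each positional row becomes a Hash keyed by column name.
hashes = rows.collect { |v| Hash.transpose(keys, v) }
p hashes.first
```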

Now, on to the rendering:

    render json: my_hash

This should end up calling `to_json` on the Hash.

For the last few Rails versions JSON encoding has been delegated to
`MultiJson`, which is ok.
Needless to say I'm using and requiring `yajl-ruby`, so I'm pretty
confident that MultiJson will pick it up as my engine, and in fact it
does:

    1.9.3p0 :001 > MultiJson.engine
     => MultiJson::Engines::Yajl

The key issue here is that with a large Hash the render action takes
36786ms to finish (of which 8253.6ms are spent on the database query;
that figure is high because of multiple ORDER BY clauses, but that's
another matter).

So I tried another way:

    render text: Yajl::Encoder.encode(my_hash)

Using `Yajl::Encoder.encode` (or `MultiJson.encode` for that matter)
yields a rather different result: 12614ms (8253.6ms still spent on the
database query).
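
To separate encoder cost from the query, the encoding step can be
timed in isolation on a synthetic payload. A sketch using the stdlib
`json` (swap `JSON.generate` for `Yajl::Encoder.encode` or
`MultiJson.encode` to compare engines; the row shape below is made
up):

```ruby
require 'json'
require 'benchmark'

# Synthetic rows roughly shaped like the playlist attributes above.
rows = Array.new(50_000) do |i|
  { 'token' => "tok#{i}", 'artist' => 'artist', 'title' => 'title',
    'duration' => '3:35', 'play_count' => i }
end

encoded = nil
elapsed = Benchmark.realtime { encoded = JSON.generate(rows) }
puts format('encoded %d rows in %.3fs (%d bytes)',
            rows.size, elapsed, encoded.bytesize)
```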

Is there something we can do about this?

Note:
some may suggest trying gems like `rabl` or `acts_as_api`, but don't
forget they are mainly for presentation purposes, and their
performance is comparable to `to_json` in this regard.

Thanks for reading.


Skimming through the code in encoding.rb (in Active Support) it looks
like the `to_json` method Rails adds ignores `ActiveSupport::JSON.backend`
and just does the encoding itself. I can only imagine this is because
for most people the bottleneck is parsing rather than generation.

Fred

"Valium helps a bit here (it should be in AR core in my opinion), it
can selectively pick the columns you need and it will return back
values in an Array per every object, so we can have something like
this: `[['a', 'b', 'c'], ['a', 'c', 'd']]`"
ActiveRecord now has `pluck`, which I think does the same.

http://guides.rubyonrails.org/active_record_querying.html#pluck

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/calculations.rb#L169
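
One caveat worth noting: in Rails 3.2 `pluck` accepts a single column,
so on that version it doesn't fully replace Valium's multi-column
`values_of` (multi-column `pluck` arrived in a later release). A
sketch of the difference (model name illustrative):

```ruby
# Rails 3.2: one column per call
MediaFile.pluck(:token)            # => ["abc", "def", ...]

# Later Rails versions accept multiple columns, returning an Array
# per row -- the same shape as the values_of example above
MediaFile.pluck(:token, :artist)   # => [["abc", "Aphex Twin"], ...]
```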