Performance of to_xml versus to_json

I've been benchmarking how fast ActiveRecord's #to_xml is on large datasets versus ActiveRecord's #to_json, and the difference is rather alarming. #to_xml quickly becomes unusable for large datasets, while #to_json remains tolerable.

The important thing here isn't how fast #to_json is -- it's that you can't practically use #to_xml past a certain point, and that point is low.

Some basic benchmarks, with a User model that has 5 fields: an integer, a string, a text, and two datetimes.

Commands benchmarked:
  to_xml - "User.find(:all).to_xml"
  to_json - "User.find(:all).to_json"
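For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing harness built on Ruby's standard Benchmark module. It is not the script used for the numbers in this post. `FakeUser`, `build_users`, and `time_serializer` are hypothetical names, the records are plain Structs rather than ActiveRecord objects, and the JSON serializer is the stdlib one, standing in for whatever call you want to time.

```ruby
require "benchmark"
require "json"

# Stand-in for the User model: five fields, as described in the post.
# This is a plain Struct, not an ActiveRecord class (an assumption for the sketch).
FakeUser = Struct.new(:id, :login, :bio, :created_at, :updated_at)

# Builds `count` in-memory records so that no database is needed.
def build_users(count)
  now = Time.now
  Array.new(count) { |i| FakeUser.new(i, "user#{i}", "bio text #{i}", now, now) }
end

# Times one serializer (any callable that takes the record array) at each size.
# Returns a hash of { size => elapsed seconds }.
def time_serializer(sizes, serializer)
  sizes.each_with_object({}) do |size, results|
    users = build_users(size)
    results[size] = Benchmark.realtime { serializer.call(users) }
  end
end

# Placeholder serializer using the stdlib JSON library.
json_serializer = ->(users) { JSON.generate(users.map(&:to_h)) }

timings = time_serializer([50, 500, 1000], json_serializer)
timings.each { |size, secs| puts format("%5d items: %.4f seconds", size, secs) }
```

In a Rails console, the same idea would wrap the commands from the post directly, for example `Benchmark.realtime { User.find(:all).to_xml }`, so that the real serializers are the ones being measured.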

50 items:
  to_xml - ~0.20 seconds
  to_json - ~0.02 seconds

500 items:
  to_xml - ~6.9 seconds
  to_json - ~0.4 seconds

1000 items:
  to_xml - ~25 seconds
  to_json - ~0.9 seconds

8000 items:
  to_xml - >10 minutes (didn't wait for it to finish)
  to_json - 10.7 seconds
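One way to read these figures is to estimate how the time grows with the number of items. The sketch below assumes time is roughly proportional to n**k and solves for k between consecutive measurements. The timings are copied from the list above; `growth_exponent` is a hypothetical helper, and with so few rough data points the result is only indicative.

```ruby
# Timings copied from the figures above, in seconds. The 8000-item to_xml run
# was never finished, so it is left out.
TIMINGS = {
  "to_xml"  => { 50 => 0.20, 500 => 6.9, 1000 => 25.0 },
  "to_json" => { 50 => 0.02, 500 => 0.4, 1000 => 0.9, 8000 => 10.7 }
}

# Growth exponent k between two measurements, assuming time ~ n**k.
# k near 1 means linear growth, k near 2 means quadratic growth.
def growth_exponent(n1, t1, n2, t2)
  Math.log(t2 / t1) / Math.log(n2.to_f / n1)
end

TIMINGS.each do |name, points|
  points.each_cons(2) do |(n1, t1), (n2, t2)|
    k = growth_exponent(n1, t1, n2, t2)
    puts format("%-7s %4d -> %4d items: k = %.2f", name, n1, n2, k)
  end
end
```

On these numbers the to_xml exponent comes out between about 1.5 and 1.9, while to_json stays between about 1.2 and 1.3, which fits the observation that to_xml degrades much faster as the dataset grows.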

Can anybody comment on this? With Rails pushing ActiveResource and REST like crazy, services which really start allowing people to list their products and userbases in XML are going to have a tough time with out-of-the-box Rails, with performance like this.

-- Eric

From your numbers, this isn't an issue for the vast majority of services. I'm not sure why anyone would send 8000 items in a request. But, if this concerns you, I'd be happy to commit some performance patches for it.

Hi Rick. Yeah, Eric's 8000 example is the extreme case, but any performance patches would be great, and we're willing to help in any way we can.

I was also wondering what to_json wasn't doing (besides the obvious :include stuff) that it should be doing to bring it on par with the to_xml code. We'll likely switch to json for our specific application, but it'd be good to know any issues before we dive in (granted, we're more than willing to fix any issues we find once we get in there).

thanks, -Chad

Eric's email sounded kind of like he was volunteering. And, I am a very willing receiver of such patches :)

I've done a bit of json hacking using the fjson gem: http://svn.techno-weenie.net/projects/plugins/json_for_rails/

The big problem I ran into was worrying about dates and times. My plugin converts them all to xmlschema times. But, there's no way to tell the type of incoming objects, so I used a regex to cast xmlschema times back to Time objects. This could cause problems if someone posted an xmlschema time as the body of a post, though. Perhaps a better solution is to assume *_at/*_on fields are times? I'm not sure. I'd like to stay consistent with other JSON APIs, but from what I gathered they differed anyway. So, I shelved the plugin for now.
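To make the trade-off concrete, here is a rough sketch of the two approaches using only Ruby's standard library. It is not the plugin's actual code. The regex, `cast_by_pattern`, and `cast_by_name` are hypothetical, written just to show how pattern-based casting can misfire on user text while name-based casting does not.

```ruby
require "json"
require "time"

# Matches strings that look like xmlschema (ISO 8601) timestamps,
# e.g. "2007-03-01T12:30:00Z" or "2007-03-01T12:30:00+01:00".
XMLSCHEMA_TIME = /\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})\z/

# Approach 1: cast any string value that looks like a timestamp.
def cast_by_pattern(attrs)
  attrs.to_h do |key, value|
    looks_like_time = value.is_a?(String) && value.match?(XMLSCHEMA_TIME)
    [key, looks_like_time ? Time.xmlschema(value) : value]
  end
end

# Approach 2: cast only string values in fields named *_at or *_on.
def cast_by_name(attrs)
  attrs.to_h do |key, value|
    timestamp_field = key.end_with?("_at", "_on") && value.is_a?(String)
    [key, timestamp_field ? Time.xmlschema(value) : value]
  end
end

record = {
  "title"      => "a post",
  "body"       => "2007-03-01T12:30:00Z", # user text that happens to look like a time
  "created_at" => Time.utc(2007, 3, 1, 12, 30).xmlschema
}
attrs = JSON.parse(JSON.generate(record))

by_pattern = cast_by_pattern(attrs)
by_name    = cast_by_name(attrs)

puts by_pattern["body"].class # the body was cast to a Time, which is the problem
puts by_name["body"].class    # the body stays a String
```

The name-based version trades one risk for another: it leaves free text alone, but it depends on a naming convention and would raise an ArgumentError if a *_at field held something that is not a valid xmlschema value, so a real implementation would need to handle that case.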