I occasionally need to stream a large XML file that represents key data in a database. I'm porting an application over from PHP Symfony, and my initial Rails implementation takes around 7 times as long. Also, with Symfony, data begins to download almost as soon as I invoke the URL, whereas with Rails, all data is processed on the server side before the client gets the first byte. I use a hand-crafted query against the DB and fetch_hash to consume raw rows from the mysql gem, and that part runs extremely quickly. I've also tried writing only a tiny subset of the XML while still reading the entire result set; that is much faster, but of course then I don't get the full XML.
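To illustrate that fast path, here's a minimal sketch of building XML by appending directly to a string from raw rows. The rows below are hypothetical stand-ins for what Mysql::Result#fetch_hash yields (plain string-keyed Hashes); the tag names are made up for the example:

```ruby
# Hypothetical rows standing in for Mysql::Result#fetch_hash output:
# plain Hashes with string keys, no ActiveRecord objects involved.
rows = [
  { 'id' => '1', 'name' => 'alpha' },
  { 'id' => '2', 'name' => 'beta' },
]

xml = String.new('<hgrants>')
rows.each do |row|
  # Appending straight to a string avoids per-row object overhead.
  xml << %(<hgrant id="#{row['id']}">#{row['name']}</hgrant>)
end
xml << '</hgrants>'
puts xml
```

This is roughly why reading the whole result set with only a tiny amount of XML output renders so quickly: the per-row cost is just a string append.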
I spent most of the past weekend trying to optimize this (hoping to do at least as well as PHP Symfony) but couldn't.
I tried:

- render :text => (lambda do |response, output| ... )
- Ruby 1.8.7 vs. Ruby 1.9.2
- Rails 2.3.5 vs. Rails 3
- XmlBuilder vs. Nokogiri::XML::Builder
- HAML vs. ERB
- Passenger vs. script/server

None of these moved the performance needle in a serious way.
I've finally concluded that Rails does not stream the response as I'd expect. Here are the render timings as the request runs:
    Rendered hgrants/_request_detail (2.2ms)
    Rendered hgrants/_request_detail (3.9ms)
    Rendered hgrants/_request_detail (2.4ms)
    Rendered hgrants/_request_detail (2.3ms)
    Rendered hgrants/_request_detail (242.7ms)
    Rendered hgrants/_request_detail (2.2ms)
    Rendered hgrants/_request_detail (1.9ms)
    Rendered hgrants/_request_detail (1.8ms)
The timings went from an average of 2ms up to 242ms and then back down. I saw this sporadically throughout the 1000 template renderings, which suggests to me that garbage collection is kicking in. Also, I'm invoking the request from curl, and it reports no data downloaded until my logfile shows Rails has finished processing all records in the view. The model IDs that produce the oversized timings vary from one request to the next, so I'm convinced nothing in the app itself is responsible. I even tested this by replacing the call to the HAML template with a block of generated text and observed the same behavior.
This is how I'm invoking HAML from the XML Builder template:

    xml << render(:partial => 'hgrants/request_detail.html.haml', :locals => { :model => model })
I also tried using this trick to get it to stream, but observed exactly the same behavior; no data showed up in curl until all records had been processed:

    render :text => (lambda do |response, output|
      extend ApplicationHelper
      xml = Builder::XmlMarkup.new(
        :target => StreamingOutputWrapper.new(output),
        :indent => 2)
      eval(default_template.source, binding, default_template.path)
    end)
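For context, the idea behind passing StreamingOutputWrapper as Builder's :target is that the target can be any object responding to <<, so each XML fragment can be pushed to the client as it is generated. A minimal sketch of that pattern (FlushingWrapper is a hypothetical name of my own, and StringIO stands in for the real Rack output object):

```ruby
require 'stringio'

# Sketch of the streaming-target idea: wrap an IO-like object so every
# fragment Builder appends is written (and flushed) immediately rather
# than buffered into one big response string.
class FlushingWrapper
  def initialize(io)
    @io = io
  end

  def <<(chunk)
    @io << chunk
    @io.flush if @io.respond_to?(:flush)  # push each fragment out right away
    self                                  # allow chained appends
  end
end

out = StringIO.new                # stands in for the response output stream
target = FlushingWrapper.new(out)
target << '<records>' << '<record id="1"/>' << '</records>'
puts out.string                   # → <records><record id="1"/></records>
```

Whether any bytes actually reach the client early then depends on the server and middleware stack not buffering the whole body, which appears to be what's going wrong here.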
(Also, in Rails 3, render :text with a Proc renders the Proc's to_str representation rather than calling it.)
This particular issue I can certainly work around, but it's disappointing if it's true that there's no way in Rails to stream output to the browser for large pages, and particularly disappointing if PHP/Symfony can outgun Rails at streaming. I've been using Rails since 2006, and most of my requests have fairly small responses, so maybe the answer is to defer to a different technology for streaming large files. But it seems like there should be a good solution for streaming data and flushing the output stream.
Any help is greatly appreciated! Eric