Optimizing generation of huge XML?

Hi all,

Currently, I'm developing a Rails app that does heavy XML generation for a RESTful web service. The XML representation uses the Nokogiri gem to generate output matching the format the client expects. The problem is that the data is quite big: around 50,000 records pulled from a table with millions of rows. Testing on my local machine, it takes about 20 minutes to get a response to the request.
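The builder code is roughly this shape (a trimmed sketch; the Record model and the element names are stand-ins for the real ones):

require 'nokogiri'

records = Record.all   # ~50,000 ActiveRecord objects loaded into memory

builder = Nokogiri::XML::Builder.new do |xml|
  xml.records do
    records.each do |r|
      xml.record(:id => r.id) do    # one element per row, shaped to the client's format
        xml.title r.title
      end
    end
  end
end

xml_output = builder.to_xml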

Do you have any ideas on how to optimize this? I'm not sure whether dropping ActiveRecord and using pure SQL statements to pull out the data for the XML generation would make the performance much faster or not.

Thanks, Samnang

Have you profiled your code to see where the bottleneck is?
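For example, a rough timing pass like this (just a sketch; Record and build_xml stand in for your actual model and Nokogiri code) would show whether the time goes into the query or into the XML generation:

require 'benchmark'

records = nil
Benchmark.bm(12) do |bm|
  bm.report('db fetch:')  { records = Record.all }   # pulling the 50,000 rows through ActiveRecord
  bm.report('xml build:') { build_xml(records) }     # the Nokogiri generation step
end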

Fred

Given the scale you describe (a huge number of records to access), it would be better to test the pure SQL query yourself; few others will have tested a case like this and be able to confirm the numbers for you. For example:
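A quick one-off check (table and column names assumed):

rows = ActiveRecord::Base.connection.select_all(
  'SELECT id, title FROM records LIMIT 50000'
)
puts rows.size   # plain hashes, no ActiveRecord objects instantiated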

Using SQL and libxml2 (via the libxml-ruby gem) directly instead of ActiveRecord and Nokogiri (which also wraps libxml2) will cut the run time. I would guess between 2x and 10x, if the code is written with speed in mind. And your code will be bigger and uglier.
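A minimal sketch of what I mean (table and column names invented; measure the real speedup yourself):

require 'libxml'

rows = ActiveRecord::Base.connection.select_all(
  'SELECT id, title FROM records LIMIT 50000'   # plain hashes, no AR objects
)

doc = LibXML::XML::Document.new
doc.root = LibXML::XML::Node.new('records')

rows.each do |row|
  node = LibXML::XML::Node.new('record')
  node['id'] = row['id'].to_s                 # attributes are set hash-style
  title = LibXML::XML::Node.new('title')
  title << row['title'].to_s                  # << appends text content
  node << title
  doc.root << node
end

File.open('records.xml', 'w') { |f| f.write(doc.to_s) }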

What's cheaper, computer time or programmer time? How many times will this generation run? And are there elapsed-time constraints? (E.g., an excellent 24-hour weather forecast that takes 28 hours to generate isn't useful.)

Jeffrey

Does it need to be XML? JSON is much lighter and faster. You can also use page caching with REST, so a subsequent request is just Apache serving a flat file. Maybe try some sort of compression too? I'm betting the bottleneck is getting the data over HTTP and loading it on the client, NOT ActiveRecord getting it out of the DB and building the XML.
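Something like this for the caching part (a sketch; the controller and action names are made up, caches_page is the Rails built-in):

class ExportsController < ApplicationController
  caches_page :show   # first request generates the file, Apache serves it afterwards

  def show
    @records = Record.all
    respond_to do |format|
      format.xml   # renders show.xml.builder (or whatever template builds the payload)
    end
  end
end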

Chris, it has to be XML because I need to pass it directly to Adobe InDesign to place the data on a document template. This is a book generation process, so it rarely runs. As Jeffrey mentioned above, maybe I can use pure SQL and libxml2 to gain speed for just this case.

Thanks for all of your feedback.