Optimizing generation of huge XML?

Hi all,

Currently, I'm developing a Rails app that does a lot of heavy XML
generation from a RESTful web service. The XML representation of the
web service uses the Nokogiri gem to generate XML that matches the
format the client expects. The problem is that the data is quite big:
around 50,000 records to pull out of a table with millions of records.
I just tested it on my local machine, and it takes about 20 minutes to
get the response from the request.

Do you have any ideas on how to optimize this? I'm not sure whether
dropping ActiveRecord and using plain SQL statements to pull out the
data for generating the XML would make it much faster or not.

Thanks,
Samnang

Have you profiled your code to see where the bottleneck is?

Fred
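
One way to act on the profiling question is to time the two stages separately before rewriting anything. This is a minimal sketch, assuming the work splits into a fetch step and an XML-building step. `fetch_records` and `build_xml` are hypothetical stand-ins, not code from the app in question:

```ruby
# Run a block and return [result, elapsed_seconds] using the monotonic clock.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

# Hypothetical stand-in for the database query (ActiveRecord or raw SQL).
def fetch_records(count)
  (1..count).map { |i| { id: i, title: "Record #{i}" } }
end

# Hypothetical stand-in for the XML builder (Nokogiri in the original app).
def build_xml(records)
  body = records.map do |r|
    id = r[:id].to_s.encode(xml: :attr)   # escaped and wrapped in quotes
    title = r[:title].encode(xml: :text)  # escapes &, < and >
    "<record id=#{id}>#{title}</record>"
  end
  "<records>#{body.join}</records>"
end

records, fetch_seconds = timed { fetch_records(50_000) }
xml, build_seconds = timed { build_xml(records) }

puts format("fetch: %.3fs, build: %.3fs", fetch_seconds, build_seconds)
```

If one stage dominates the total, that is the stage worth optimizing first. In the real app the two stand-ins would be replaced by the actual query and the actual Nokogiri builder call.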

Given the situation you describe, i.e. a huge number of records to
access, it would be better to run the 'pure SQL query' yourself as a
test. It may be a rare enough practice that few others have tested it
and can vouch for it.

Using SQL and libxml2 (libxml-ruby gem) directly instead of ActiveRecord and
Nokogiri (which is itself a wrapper around libxml2) will cut the run time. I
would guess by between 2x and 10x, if the code is written with speed in mind.
And your code will be bigger and uglier.

What's cheaper, computer time or programmer time? How many times will this
generation be run? And are there elapsed time constraints? (E.g., an excellent
24-hour weather forecast that takes 28 hours to generate isn't useful.)

Jeffrey
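
To make "written with speed in mind" concrete, here is a rough sketch of the general shape: read plain rows in batches and write each element straight to an output stream, with no model objects and no in-memory DOM. It swaps in plain string building for libxml-ruby so that it runs with core Ruby only. `each_row_batch`, the element names, and the batch size are all assumptions made for illustration:

```ruby
require "stringio"

# Hypothetical stand-in for raw SQL rows. In a Rails app, rows like these
# might come from ActiveRecord::Base.connection.select_all(sql), which
# returns plain hashes instead of full model objects.
def each_row_batch(total, batch_size)
  return enum_for(:each_row_batch, total, batch_size) unless block_given?

  (1..total).each_slice(batch_size) do |ids|
    yield ids.map { |id| { "id" => id, "title" => "Book & Chapter #{id}" } }
  end
end

# Write one <record> element per row directly to the IO, without a DOM.
def write_records(io, batches)
  io << %(<?xml version="1.0" encoding="UTF-8"?>\n<records>\n)
  batches.each do |rows|
    rows.each do |row|
      # encode(xml: :attr) escapes the value and wraps it in double quotes.
      io << "  <record id=" << row["id"].to_s.encode(xml: :attr) << ">"
      # encode(xml: :text) escapes &, < and > in element content.
      io << "<title>" << row["title"].encode(xml: :text) << "</title>"
      io << "</record>\n"
    end
  end
  io << "</records>\n"
end

out = StringIO.new
write_records(out, each_row_batch(50_000, 1_000))
puts out.string.lines.first(3)
```

Because `write_records` only needs an object that responds to `<<`, the `StringIO` could be replaced by a `File` opened for writing so the whole document never has to sit in memory. Whether this reaches the 2x to 10x guessed above would have to be measured on the real data.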

Does it need to be XML? JSON is much lighter and faster. You can
also use page caching with REST, so that subsequent requests are just
like Apache serving a flat file. Maybe try some sort of compression
too? I'm betting the bottleneck is getting the data over HTTP and
loaded by the client, NOT AR getting it out of the DB and building XML.
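
The caching and compression ideas can be illustrated with a small sketch. It is not Rails page caching itself, only the underlying idea: generate once, serve the stored file afterwards, and compare the raw size with the compressed size. The `cached` helper, the file name, and the sample data are assumptions made for illustration:

```ruby
require "zlib"
require "tmpdir"
require "fileutils"

# Return the cached file contents if present; otherwise run the block once,
# store its result on disk, and return it.
def cached(path)
  return File.binread(path) if File.exist?(path)

  data = yield
  File.binwrite(path, data)
  data
end

# Hypothetical stand-in for the expensive XML generation.
generations = 0
generate = lambda do
  generations += 1
  items = (1..5_000).map { |i| %(<record id="#{i}"><title>Title #{i}</title></record>) }
  "<records>#{items.join}</records>"
end

dir = Dir.mktmpdir
path = File.join(dir, "records.xml")

first = cached(path) { generate.call }   # generated and written to disk
second = cached(path) { generate.call }  # read back from disk, block not run

# Deflate is used here only to show how well repetitive XML compresses.
compressed = Zlib::Deflate.deflate(first)

puts "generated #{generations} time(s)"
puts "#{first.bytesize} bytes raw, #{compressed.bytesize} bytes deflated"

FileUtils.remove_entry(dir)
```

In a real deployment the web server would normally handle both the static file and the HTTP compression, so this only shows why the approach helps for repeated requests.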

Chris, it has to be XML because I need to pass it directly to Adobe
InDesign, which places the data on a document template. This is a book
generation process, so it rarely runs. As Jeffrey mentioned above, maybe
I can use pure SQL and libxml2 to gain speed for just this one problem.

Thanks for all of your feedback.