Fastest Ruby XML parser -- FreeBSD 6.2

Hi,

I'm trying to find the fastest XML parser for FreeBSD 6.2. I tried to install the gem for libxml, but it doesn't install on FreeBSD 6.2. I also tried the FreeBSD port for libxml-ruby, but it appears to be very buggy.

I'm taking several XML feeds, 100 KB or less each, parsing them, and putting them into a database. REXML is slow. The word on the net is to use a C parser -- is there a good one I can use, with documentation a newbie can follow?

Thanks,

Charlie

Hi, please define what you mean by slow. What times are you seeing during the parsing of the XML?
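One way to put a number on "slow" is to time REXML on a document of about the same size as the feeds. The following is only a minimal sketch: the feed is generated in place as a stand-in for a real one, and the element names (`rss`, `channel`, `item`, `title`) and the helpers `sample_feed` and `parse_titles` are assumptions made for illustration, not anything taken from the actual feeds.

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical stand-in for a real feed: an RSS-like document with
# roughly 100 KB of items.
def sample_feed(items = 500)
  body = (1..items).map do |i|
    "<item><title>Title #{i}</title><link>http://example.com/#{i}</link>" \
      "<description>#{'x' * 100}</description></item>"
  end.join
  "<rss><channel>#{body}</channel></rss>"
end

# Build the full REXML tree, then collect the text of every item title.
def parse_titles(xml)
  doc = REXML::Document.new(xml)
  titles = []
  doc.elements.each('rss/channel/item/title') { |el| titles << el.text }
  titles
end

xml = sample_feed
titles = nil
# Wall-clock time for one parse plus one XPath walk.
elapsed = Benchmark.realtime { titles = parse_titles(xml) }
puts "#{xml.bytesize} bytes, #{titles.size} items, #{format('%.3f', elapsed)} s"
```

Running the same measurement on one of the real feeds, and separately timing the database inserts, would show whether the parser is really where the time goes.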

-Conrad

charlie caroff wrote:

Hi,

I'm trying to find the fastest XML parser for FreeBSD 6.2. I tried to install the gem for libxml, but it doesn't install on FreeBSD 6.2. I also tried the FreeBSD port for libxml-ruby, but it appears to be very buggy.

I'm taking several XML feeds, 100 KB or less each, parsing them, and putting them into a database. REXML is slow. The word on the net is to use a C parser -- is there a good one I can use, with documentation a newbie can follow?

You could try Hpricot.

Hpricot is extremely slow on large documents (> 50 MB) but should be fast at the size you are looking at. libxml would still be the fastest option if you can get it working. (I had to strip all tags from the root node to get it working for me, but I am dealing with 5-10 GB files, so I had no choice but to go with libxml-ruby.)

Craig

Conrad Taylor wrote:

Hi, please define what you mean by slow. What times are you seeing during the parsing of the XML?

-Conrad

    Hi,

    I'm trying to find the fastest XML parser for FreeBSD 6.2. I tried to install the gem for libxml, but it doesn't install on FreeBSD 6.2. I also tried the FreeBSD port for libxml-ruby, but it appears to be very buggy.

    I'm taking several XML feeds, 100 KB or less each, parsing them, and putting them into a database. REXML is slow. The word on the net is to use a C parser -- is there a good one I can use, with documentation a newbie can follow?

    Thanks,

    Charlie

I use the expat library - one of the samples that comes with the code is simple enough to extend if you know C. My suggestion would be to parse the XML quickly in C and convert it into another format that your Ruby code can then use.
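For anyone who has not used an event-driven parser like expat: the parser calls your handlers for each start tag, text run, and end tag, and you build whatever flat structure you need along the way. The sketch below shows that callback pattern in Ruby using REXML's own stream parser. This is not expat and, being pure Ruby, will not have the speed of a C parser; it only illustrates the shape of the approach. The `ItemCollector` class and the `item`, `title`, and `link` element names are hypothetical.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Collects one hash per <item>, keeping only the child elements listed in
# FIELDS. Element and field names are hypothetical.
class ItemCollector
  include REXML::StreamListener

  FIELDS = %w[title link].freeze

  attr_reader :rows

  def initialize
    @rows = []
    @current = nil # hash for the <item> being read, nil outside an item
    @field = nil   # name of the field being read, nil otherwise
  end

  # Called for every opening tag.
  def tag_start(name, _attrs)
    if name == 'item'
      @current = {}
    elsif @current && FIELDS.include?(name)
      @field = name
      @current[@field] = String.new
    end
  end

  # Called for every run of character data; may fire more than once per
  # element, so the text is appended rather than assigned.
  def text(data)
    @current[@field] << data if @current && @field
  end

  # Called for every closing tag.
  def tag_end(name)
    if name == 'item'
      @rows << @current
      @current = nil
    elsif name == @field
      @field = nil
    end
  end
end

xml = '<rss><channel>' \
      '<item><title>First</title><link>http://example.com/1</link></item>' \
      '<item><title>Second</title><link>http://example.com/2</link></item>' \
      '</channel></rss>'

collector = ItemCollector.new
REXML::Document.parse_stream(xml, collector)
collector.rows.each { |row| puts row.inspect }
```

Each entry in `collector.rows` is a plain hash, which is the kind of intermediate format that maps directly onto a database insert.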

It would be good if you could explain how often you need to do this and whether it happens only in the backend or in response to a user-loaded XML file.

Cheers, Mohit. 8/15/2007 | 11:39 PM.