Parsing XML file with no style info with Hpricot

Hello,

I've been trying for hours to parse an XML file using Hpricot. Usually it's not a problem. Here's my simple code:

# This works and outputs the proper XML data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

# This does not work, and I'm scratching my head
@url1 = 'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

The gd2.mlb.com XML file does not have any style information according to Firefox. I can read it using Oxygen. Can somebody provide me with a hint on how to parse the mlb.com XML? Thanks!

-A

Any idea how to parse this XML?

-A

Allan Last wrote:

And I'm scratching mine trying to guess what you mean by "does not work" ...

Hpricot is not parsing the MLB XML file. I suspect the reason is that the file is not in a standard XML format.

If you give my code a quick try, you'll notice that it will read other XML files, but not the MLB XML.

# This works and outputs the proper XML data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

# This does not work, and I'm scratching my head
@url1 = 'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

Hassan Schroeder wrote:

Actually, I already did, and it seems to work just fine. Hence my own head-scratching. :-)

So, again, maybe you can say *exactly* what you expect to happen and how that differs from what you're seeing.

Hi Hassan,

This picture: http://picasaweb.google.com/lh/photo/Qf4DFta9p5ERoCRb6Lbd2Q?feat=directlink

This is the parsed output of the sportingnews XML feed, displayed in my view with <%= @page1 %>.

This picture: http://picasaweb.google.com/lh/photo/xLVr8_U-x12rJnADs_qcEw?feat=directlink

The blank space is what is displayed in the view with <%= @page1 %> when using the MLB XML file.

I'm expecting the XML information seen here on Firefox: http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=directlink

to be displayed when I parse the MLB file. Hpricot is not parsing this file.

-A

Hassan Schroeder wrote:

I'm expecting the XML information seen here on Firefox: http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=dire

to be displayed when I parse the MLB file. Hpricot is not parsing this file.

Have you tried viewing the source of the page generated by your view? I suspect Hpricot is parsing the file, but just blatting it into the view like that is producing invalid HTML, which your browser is not rendering.
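
To illustrate the point about the view, here is a minimal sketch using only Ruby's standard ERB library. The sample XML string and its element and attribute names are hypothetical stand-ins, not the real boxscore.xml. It compares dropping the XML into a page raw with escaping it first, which is roughly what the h helper does in a Rails view. In Rails versions of that era, <%= %> did not escape its output by default, so this is only an approximation of the setup in the thread.

```ruby
require 'erb'

# Hypothetical stand-in for the fetched feed (not the real boxscore.xml).
sample_xml = '<boxscore game_id="example"><team team_code="ana" /></boxscore>'

# Raw interpolation: the browser receives unknown tags and renders nothing visible.
raw_template = ERB.new('<div><%= xml %></div>')

# Escaped interpolation: the markup arrives as text and is shown literally.
escaped_template = ERB.new('<pre><%= ERB::Util.html_escape(xml) %></pre>')

raw_html = raw_template.result_with_hash(xml: sample_xml)
escaped_html = escaped_template.result_with_hash(xml: sample_xml)

puts raw_html
puts escaped_html
```

If the values sit in attributes rather than in element text, as the sample assumes, the raw version leaves the browser with nothing to draw, which would match the blank space in the screenshot.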

Fred

I'm expecting the XML information seen here on Firefox:

to be displayed when I parse the MLB file. Hpricot is not parsing this file.

Sure it is -- use irb to examine what's in @page1.

As Frederick already suggested, you apparently have a view problem, not an Hpricot parsing problem.
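
A rough sketch of that kind of inspection, which can be pasted into irb. It swaps in REXML from Ruby's standard distribution instead of Hpricot so that it runs without extra gems, and the sample XML with its attribute names is a hypothetical stand-in for the feed. The assumption is that the data lives in attributes, so the document has no visible text even though it parses fine.

```ruby
require 'rexml/document'

# Hypothetical stand-in for the feed; element and attribute names are assumptions.
sample_xml = <<~XML
  <boxscore game_id="example">
    <team team_flag="away" short_name="Away Team" />
    <team team_flag="home" short_name="Home Team" />
  </boxscore>
XML

doc = REXML::Document.new(sample_xml)

# Collect all text nodes: only whitespace between elements, so nothing visible.
text_content = REXML::XPath.match(doc, '//text()').map(&:to_s).join.strip

# The actual values are in attributes and have to be pulled out explicitly.
team_names = REXML::XPath.match(doc, '//team').map { |team| team.attributes['short_name'] }

puts "text content: #{text_content.inspect}"
puts "team names: #{team_names.inspect}"
```

With Hpricot the analogous steps would be parsing with Hpricot.XML and reading attributes off the elements returned by a search, then printing those values in the view instead of the whole document.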

Thanks everybody. I saw the info in the page source. I figured it out.

-A

Hassan Schroeder wrote: