XML behaviour issue

I am trying to parse both RSS and Atom feeds using Nokogiri. I've run into a problem: I cannot use one query to select both the RSS element and the Atom element.

E.g.

    @item = "item"
    root.xpath("//#{@item}")    # selects the items in an RSS feed

    @item = "entry"
    root.xpath("//#{@item}")    # does not select the entries in an Atom feed

This happens even though the entry tag has no explicit namespace prefix attached to it. I understand that this is an implicit (default) namespace issue and that it follows from the XML specifications. However, I am wondering whether there is a way around it: can Nokogiri somehow modify the document so that an XPath query on an implicitly namespaced element works without a namespace? Thanks!

Kefan X. wrote in post #968930:

I am trying to parse both rss feed and atom feed using Nokogiri.

There's probably a library that does this already; if there is, you should use it if possible. But read on for other ideas.

I've run into a problem: I cannot use one query to select both the RSS element and the Atom element.

E.g.

    @item = "item"
    root.xpath("//#{@item}")    # selects the items in an RSS feed

    @item = "entry"
    root.xpath("//#{@item}")    # does not select the entries in an Atom feed

This happens even though the entry tag has no explicit namespace prefix attached to it. I understand that this is an implicit (default) namespace issue and that it follows from the XML specifications. However, I am wondering whether there is a way around it: can Nokogiri somehow modify the document so that an XPath query on an implicitly namespaced element works without a namespace?

Do you want to use XSL or something (perhaps Ruby!) to transform RSS to Atom (or vice versa)? Do I understand correctly?

If not, then I think this is an XPath question, and might be better directed to an XML forum.

Alternatively, you could detect whether the feed is RSS or Atom, then build the XPath expression accordingly.

Thanks!

Best,

Marnen Laibow-Koser wrote in post #968931:

Kefan X. wrote in post #968930:

I am trying to parse both rss feed and atom feed using Nokogiri.

There's probably a library that does this already; if there is, you should use it if possible. But read on for other ideas.

I've run into a problem: I cannot use one query to select both the RSS element and the Atom element.

E.g.

    @item = "item"
    root.xpath("//#{@item}")    # selects the items in an RSS feed

    @item = "entry"
    root.xpath("//#{@item}")    # does not select the entries in an Atom feed

This happens even though the entry tag has no explicit namespace prefix attached to it. I understand that this is an implicit (default) namespace issue and that it follows from the XML specifications. However, I am wondering whether there is a way around it: can Nokogiri somehow modify the document so that an XPath query on an implicitly namespaced element works without a namespace?

Do you want to use XSL or something (perhaps Ruby!) to transform RSS to Atom (or vice versa)? Do I understand correctly?

Yes, I need to transform one format to another.

If not, then I think this is an XPath question, and might be better directed to an XML forum.

This is not really an XPath question; in my opinion, the spec could have taken a different approach here. It is a fundamental obstacle, so I am asking whether there is a Nokogiri workaround for it.

Alternatively, you could detect whether the feed is RSS or Atom, then build the XPath expression accordingly.

Yeah, this is the last approach I want to take, but I agree it is the simplest method. Thanks a lot for your help though!