Using Nokogiri to insert a <span> tag into existing text

All,

I'm using Nokogiri to handle the following problem:

I have a piece of HTML, and for certain text nodes, I need to insert <span> tags into the text of these nodes at a certain place.

What I am doing is finding the place where I want to insert the <span>, let's say index X of the text node's text, and doing the following:

1) Setting the node's text to just what is before index X
2) Adding the <span> as a next sibling to the original node
3) Adding another next sibling to the original node that is another text node, whose contents are the rest of the text in the original node.

I can pass a string with "<span>blah</span>" to Node#add_next_sibling to handle #2.

I'm having trouble with creating a new text node and passing it to Node#add_next_sibling though.

1) Does anyone have an example of creating a new text node in Nokogiri?

2) Is there a simpler way to do this than splitting up one text node into 3 nodes?

Many thanks, Wes

Yes, there is a simpler way. If you already have a handle to that text node, use the content= method to write your new HTML into it as text. Build that text with a regular expression or plain string concatenation in ordinary Ruby. As long as you don't need to manipulate the new markup as nodes afterwards, this is the simplest method I can think of.

If you later need to access that span as a new Nokogiri node, you will have to do something more complex.

Walter

Thanks Walter, ended up with this:

doc = Nokogiri::HTML.parse(File.new(merge_path))
nodes = doc.xpath("//text()[contains(.,'MERGE')]")
nodes.each do |node|
  text = node.text
  if md = text.match(/.{2}MERGE(\d+).{2}/)
    start_index = text.index(md[0])
    start_span_tag = "<span id='merge_#{md[1]}'>"
    end_index = start_index + start_span_tag.length + md[0].length
    node.content = text.insert(start_index, start_span_tag).insert(end_index, '</span>')
  end
end

Works great.

Wes

This doesn't work - when I write this back out to a file the <span> is escaped.

Perhaps I should have mentioned that I needed to re-serialize the resulting HTML.

Wes

Node#content= takes plain text and escapes any markup in it; it does not create new nodes. Try using Node#inner_html= instead.

Walter

This works:

#Surround all of the text (NOT attribute value) merge fields with <span> tags for ease of manipulation later

doc = Nokogiri::HTML.parse(html)
nodes = doc.xpath("//text()[contains(.,'MERGE')]")
nodes.each do |node|
  text = node.text.dup
  if md = text.match(/.{2}MERGE(\d+).{2}/)
    start_index = text.index(md[0])
    end_index = start_index + md[0].length
    node = node.replace(text[0..start_index - 1])[0]
    node.after("<span id='merge_#{md[1]}'>#{md[0]}</span>#{text[end_index..-1]}")
  end
end

I tried Node.inner_html= to no avail, setting it to the string that resulted if I interpolated the <span> where I wanted it. Not sure why it didn't work, but the replace/after works, so I went with that.

Thanks for the help.

Wes

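That's most likely because inner_html= was being called on the text node itself. Node#inner_html= does accept a string of markup, but a text node can't hold child elements, so the markup has to be set on the parent element instead. Glad you got it working.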

Walter