I'm using a Nokogiri-based helper to truncate text without breaking HTML tags:

    require "rubygems"
    require "nokogiri"

    module TextHelper
      def truncate_html(text, max_length, ellipsis = "...")
        ellipsis_length = ellipsis.length
        doc = Nokogiri::HTML::DocumentFragment.parse text
        content_length = doc.inner_text.length
        actual_length = max_length - ellipsis_length
        content_length > actual_length ? doc.truncate(actual_length).inner_html + ellipsis : text.to_s
      end
    end

    module NokogiriTruncator
      module NodeWithChildren
        def truncate(max_length)
          return self if inner_text.length <= max_length
          truncated_node = self.dup
          truncated_node.children.remove

          self.children.each do |node|
            remaining_length = max_length - truncated_node.inner_text.length
            break if remaining_length <= 0
            truncated_node.add_child node.truncate(remaining_length)
          end
          truncated_node
        end
      end

      module TextNode
        def truncate(max_length)
          Nokogiri::XML::Text.new(content[0..(max_length - 1)], parent)
        end
      end
    end

    Nokogiri::HTML::DocumentFragment.send(:include, NokogiriTruncator::NodeWithChildren)
    Nokogiri::XML::Element.send(:include, NokogiriTruncator::NodeWithChildren)
    Nokogiri::XML::Text.send(:include, NokogiriTruncator::TextNode)

The line

    content_length > actual_length ? doc.truncate(actual_length).inner_html + ellipsis : text.to_s

appends the ellipsis after the serialized HTML, that is, just after the last closing tag.

In my view I call:

    <%= truncate_html(news.parsed_body, 700, "... Read more.").html_safe %>

The issue is that the text being parsed is wrapped in <p></p> tags, so the ellipsis ends up outside the paragraph and the view breaks:

    "Lorem Ipsum</p> ... Read More"

Does anyone know how to append the ellipsis just before the last closing </p> tag using Nokogiri, so that the final output becomes:

    "Lorem Ipsum... Read More</p>"