require "nokogiri"
doc = Nokogiri::HTML::Document.new("<title> Save the page! </title>")
doc.class # => Nokogiri::HTML::Document
doc = Nokogiri::HTML::Document.parse <<-eof
<head>
  <meta name="description" content="Free Web tutorials">
  <meta name="keywords" content="HTML,CSS,XML,JavaScript">
  <meta name="author" content="Ståle Refsnes">
  <meta charset="UTF-8">
</head>
eof
I think the problem is that when Nokogiri parses HTML, it assumes HTML 4.0 Transitional, as evidenced by the DOCTYPE.
I'm not sure how to get it to handle HTML5.