nuno <rails-mailing-list@...> writes:
Hello, I'm looking for an HTML parser that can handle badly formed input.
There's a pretty good HTML parser in RoR ActionPack, but it doesn't
handle badly formed documents.
Just a technical point: unclosed tags are _not_ badly formed in HTML, they are
exactly the _right_ way to do things in HTML. HTML is not supposed to be an
XML-based language, and self-closing tags are invalid.
That said, I agree with the person who said it's better to just treat it as one
long string and regex it.
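
For what it's worth, here is a rough sketch of what I mean by "regex it", using
nothing but String#scan and String#gsub from core Ruby. The sample markup and
the variable names (html, links, text) are made up for illustration, and the
patterns are deliberately naive: they assume quoted attribute values and no
">" inside attributes, so treat this as a starting point rather than a real
parser.

```ruby
# Hypothetical sample input: unclosed <p> and <li> tags, mixed-case tag names.
html = <<~HTML
  <P>First paragraph
  <p>Second with a <a href="http://example.com/a">link
  <ul><li>One<li>Two <A HREF='/b'>another</a>
HTML

# Pull out href values without building a tree; /i copes with mixed-case tags.
# Assumption: attribute values are quoted and contain no quote characters.
links = html.scan(/<a\s[^>]*?href\s*=\s*["']([^"']*)["']/i).flatten

# Replace every tag with a space, then collapse runs of whitespace.
text = html.gsub(/<[^>]*>/, " ").gsub(/\s+/, " ").strip

puts links.inspect  # => ["http://example.com/a", "/b"]
puts text           # => First paragraph Second with a link One Two another
```

Neither pattern cares whether a tag was ever closed, which is the whole point
of working on the raw string.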