Cleaning up HTML markup from external sources

I'm using acts_as_amazon_product to pull content about books I own. This makes it very easy to get lots of data about each book without having to enter it all myself.

It even includes a description of sorts. The problem is that, while this description is some form of HTML markup, it is not well-formed XHTML: elements are opened but never closed. For instance, I see things like <li>...<li>... and <p>...<p>...

Perhaps I'm trying too hard, but I'd love this to be valid XHTML.

Can anyone recommend a way to clean this up in a fairly reliable way? I don't mind doing this at display time.
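
To make the problem concrete, here is a rough Ruby sketch of the kind of cleanup I have in mind. It is only an illustration, not a real solution: close_implied_tags is a hypothetical helper of my own (it is not part of acts_as_amazon_product or Rails), it assumes lowercase tag names, and it only knows about unclosed <li> and <p> elements plus a few empty elements such as <br>. It does nothing about unescaped ampersands or other problems, so a real HTML parser or tidying tool would almost certainly be more reliable.

```ruby
# Naive sketch: add the end tags that HTML lets you omit for <li> and <p>,
# and self-close a few empty elements so the result is closer to XHTML.
# Assumes lowercase tag names; hypothetical helper, not a library API.
VOID_TAGS = %w[br hr img].freeze

def close_implied_tags(html)
  stack = []   # names of the elements that are currently open
  out = +""

  # The capture group makes split keep the tags as separate tokens.
  html.split(/(<[^>]+>)/).each do |token|
    next if token.empty?

    if (m = token.match(/\A<\/([a-z][a-z0-9]*)\s*>\z/))
      # End tag: close everything still open inside it, then the element.
      # An end tag with no matching open element is dropped.
      name = m[1]
      if stack.include?(name)
        loop do
          closed = stack.pop
          out << "</#{closed}>"
          break if closed == name
        end
      end
    elsif token.start_with?("<") && token.end_with?("/>")
      # Already self-closed, e.g. <br/>; pass it through untouched.
      out << token
    elsif (m = token.match(/\A<([a-z][a-z0-9]*)\b[^>]*>\z/))
      name = m[1]
      if VOID_TAGS.include?(name)
        # Empty element: rewrite <br> as <br /> and do not track it.
        out << token.sub(/\s*>\z/, " />")
        next
      end
      if name == "li"
        # A new <li> ends the previous item (and any <p> left open in it).
        out << "</#{stack.pop}>" while %w[li p].include?(stack.last)
      elsif name == "p"
        # A new <p> ends the previous paragraph.
        out << "</#{stack.pop}>" if stack.last == "p"
      end
      stack.push(name)
      out << token
    else
      # Plain text, comments, and anything else pass through unchanged.
      out << token
    end
  end

  # Close whatever is still open at the end of the fragment.
  out << "</#{stack.pop}>" until stack.empty?
  out
end

puts close_implied_tags("<ul><li>One<li>Two</ul><p>First<p>Second")
# prints <ul><li>One</li><li>Two</li></ul><p>First</p><p>Second</p>
```

Something like this could be called from a view helper at display time, but I'd much rather use an existing, well-tested tool than maintain it myself.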

Thanks, --Michael