Cleaning up HTML markup from external sources

I'm using acts_as_amazon_product to pull in data about books I own,
which saves me from having to enter all of it myself.

It even includes a description of sorts. The problem is that, while
the description is some form of HTML markup, it is not XHTML. For
instance, I see unclosed tags like <li>...<li>... and <p>...<p>...
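
To make the problem concrete, here is a minimal sketch of how I'd check
whether a description is at least well-formed XML, which XHTML requires.
It uses REXML, which ships with Ruby. The helper name
well_formed_fragment? and the sample strings are made up for
illustration; nothing here comes from acts_as_amazon_product itself.

```ruby
require "rexml/document"

# Hypothetical helper (not part of acts_as_amazon_product): returns true
# if the fragment parses as well-formed XML, false otherwise.
def well_formed_fragment?(fragment)
  # Wrap the fragment in a single root element, because a description
  # may contain several sibling elements and XML allows only one root.
  REXML::Document.new("<div>#{fragment}</div>")
  true
rescue REXML::ParseException
  # REXML raises when a closing tag does not match the open element,
  # which is what happens with <li>...<li>... inside a <ul>.
  false
end

# The kind of markup I'm getting back (unclosed <li> tags):
well_formed_fragment?("<ul><li>one<li>two</ul>")

# What I'd like to end up with:
well_formed_fragment?("<ul><li>one</li><li>two</li></ul>")
```

This only detects the problem, it does not fix it, and passing it does
not mean the markup is valid XHTML, only that the tags are balanced.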

Perhaps I'm trying too hard, but I'd love this to be valid XHTML.

Can anyone recommend a way to clean this up in a fairly reliable way?
I don't mind doing this at display time.