pretty-print and cleanse RHTML?

Suppose someone gave us fresh HTML to import as eRB (.rhtml), such as from an obsolete PHP project. We ought to upgrade, cleanse, and pretty-print that HTML like this:

tidy -i -asxhtml old.html > new.rhtml

For a file that already contains eRB tags, below my sig is a program that temporarily replaces <% and %>
with <!--% and %-->, then runs Tidy. Save it as 'tidyErb.rb', and run it like this:

usage: ruby tidyErb.rb <filename.rhtml> >output.rhtml

'filename.rhtml' and 'output.rhtml' must not be the same file. The program also clobbers a scratch file called 'scratch.html', and makes no attempt to avoid overwriting a source file with the same name...
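
For anyone who wants the gist of the tag swap without reading the whole program, here is a minimal sketch. The method names mask_erb and unmask_erb are my own for illustration, not necessarily what tidyErb.rb uses, and the sketch leaves out the Tidy call and the scratch file entirely:

```ruby
# Hypothetical helpers (names are assumptions, not taken from tidyErb.rb).

# Hide eRB tags inside HTML comments so an HTML tool passes them through.
def mask_erb(source)
  source.gsub('<%', '<!--%').gsub('%>', '%-->')
end

# Reverse the masking after the HTML tool has finished.
def unmask_erb(source)
  source.gsub('<!--%', '<%').gsub('%-->', '%>')
end

rhtml  = "<p><%= h @user.name %></p>\n"
masked = mask_erb(rhtml)
puts masked              # => <p><!--%= h @user.name %--></p>
puts unmask_erb(masked)  # => <p><%= h @user.name %></p>
```

The round trip is only exact if nothing touches the comments in between. Tidy may rewrite them, so the unmasking patterns might need adjusting to whatever Tidy actually emits.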

As a convenience, the program reports diagnostics to STDERR. Obey them (per assert_tidy), to improve your programs!

Note that Tidy treats <!-- --> as flow-tags, not block-tags. (My verbiage: <em> is a flow-tag, and <div> is the canonical block-tag. Tidy line-wraps the former.)

Searching for each <% tag and moving it to its correct indentation (such as for <% end %>) is a small price to pay for clean HTML!
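
To make that search less tedious, something like the following could list every line that still holds an eRB tag. It is only a sketch: erb_lines is a made-up name, and it reports line numbers without trying to fix the indentation for you:

```ruby
# Hypothetical helper (the name is an assumption): return [line_number, text]
# pairs for every line that contains an eRB opening tag.
def erb_lines(text)
  text.each_line.with_index(1)
      .select { |line, _number| line.include?('<%') }
      .map    { |line, number| [number, line.chomp] }
end

tidied = "<div>\n<% if @user %>\n  <p>hi</p>\n<% end %>\n</div>\n"
erb_lines(tidied).each do |number, line|
  puts "#{number}: #{line}"
end
# prints:
# 2: <% if @user %>
# 4: <% end %>
```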

Oh, also, review my gsubs to see if they match what Tidy did to your RHTML's comments, and to any <% %> nested inside attributes. If I return to this project, I will just upgrade Tidy...