Hi,
I'm looking for a way to convert html to plain text. Now, I know about strip_tags, but - as the name says - that only strips the tags.
What I need is to get stuff like &amp; and &lt; back to & and < too. Any help?
Thanks, Mathijs
You could use a regexp together with the hash ERB::Util::HTML_ESCAPE to map the escaped entities back to their unescaped characters. - Richard
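A minimal sketch of that regexp-plus-hash idea, in case it helps. The ESCAPES hash here is a local stand-in written out by hand (an assumption, so the snippet runs outside Rails), and html_to_text is a hypothetical helper name. The tag stripping is deliberately crude; it is not a replacement for strip_tags.

```ruby
# Local stand-in for an escape table such as ERB::Util::HTML_ESCAPE
# (assumption: written out here so the snippet is self-contained).
ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;"
}.freeze

# Invert it so each entity maps back to its plain character.
UNESCAPES = ESCAPES.invert.freeze

# One pattern that matches any of the known entities.
ENTITY_PATTERN = Regexp.union(UNESCAPES.keys)

# Hypothetical helper: drop tags, then decode the known entities.
def html_to_text(html)
  text = html.gsub(/<[^>]*>/, "")                 # crude tag removal
  text.gsub(ENTITY_PATTERN) { |entity| UNESCAPES[entity] }
end

puts html_to_text("<p>Fish &amp; chips &lt; 5</p>")
```

Doing the entity replacement in a single gsub pass means a double-escaped string such as &amp;lt; comes back as &lt; rather than being decoded twice. Only the four entities in the table are handled; numeric entities and things like &nbsp; would need extra entries or a real HTML parser.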
You might want to check out some example code in the convert_attachment_to plugin:
http://github.com/kete/convert_attachment_to/tree/master
Depending on configuration, it will take an uploaded HTML file (or PDF, MS doc…) and convert it into a plain text attribute, etc. Probably overkill for what you are after, but it might have something you can learn from.
Cheers,
Walter