Typo extends the String class to add the following method:
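class String
  # Strips any HTML markup from a string
  TYPO_TAG_KEY = TYPO_ATTRIBUTE_KEY = /[\w:_-]+/
  # An attribute value is either a bare alphanumeric word or a quoted string
  TYPO_ATTRIBUTE_VALUE = /(?:[A-Za-z0-9]+|(?:'[^']*?'|"[^"]*?"))/
  TYPO_ATTRIBUTE = /(?:#{TYPO_ATTRIBUTE_KEY}(?:\s*=\s*#{TYPO_ATTRIBUTE_VALUE})?)/
  TYPO_ATTRIBUTES = /(?:#{TYPO_ATTRIBUTE}(?:\s+#{TYPO_ATTRIBUTE})*)/
  # Matches opening, closing, self-closing and declaration-style tags
  TAG = %r{<[!/?\[]?(?:#{TYPO_TAG_KEY}|--)(?:\s+#{TYPO_ATTRIBUTES})?\s*(?:[!/?\]]+|--)?>}

  def strip_html
    # Remove the tags, then collapse runs of whitespace and trim the ends
    self.gsub(TAG, '').gsub(/\s+/, ' ').strip
  end
end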
I haven’t run into any edge cases where it fails yet, but if anyone finds one, a bug report would be welcome.
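To see what the method does in practice, here is a minimal usage sketch. The constant definitions are repeated only so that the snippet runs on its own; they follow my reading of the pattern above, assuming that the `*` quantifiers, the `--` comment markers and the escaped bracket were lost in formatting. The sample strings are made up for illustration.

```ruby
# Sketch only: reopens String with the pattern as reconstructed above.
class String
  TYPO_TAG_KEY = TYPO_ATTRIBUTE_KEY = /[\w:_-]+/
  TYPO_ATTRIBUTE_VALUE = /(?:[A-Za-z0-9]+|(?:'[^']*?'|"[^"]*?"))/
  TYPO_ATTRIBUTE = /(?:#{TYPO_ATTRIBUTE_KEY}(?:\s*=\s*#{TYPO_ATTRIBUTE_VALUE})?)/
  TYPO_ATTRIBUTES = /(?:#{TYPO_ATTRIBUTE}(?:\s+#{TYPO_ATTRIBUTE})*)/
  TAG = %r{<[!/?\[]?(?:#{TYPO_TAG_KEY}|--)(?:\s+#{TYPO_ATTRIBUTES})?\s*(?:[!/?\]]+|--)?>}

  def strip_html
    # Drop anything that looks like a tag, then normalise whitespace
    gsub(TAG, '').gsub(/\s+/, ' ').strip
  end
end

# Hypothetical sample inputs
puts "<p>Hello <b>world</b></p>".strip_html                # prints: Hello world
puts '<a href="http://example.com/">a link</a>'.strip_html # prints: a link
puts "line one <br /> line two".strip_html                 # prints: line one line two
```

Note that tags are replaced with an empty string rather than a space, so text on either side of a tag is joined directly unless the markup already contains whitespace there.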