Regex in Ruby - Strip HTML out of comments - help

Typo extends the String class to add the following method:

    class String
      # Strips any html markup from a string
      TYPO_TAG_KEY = TYPO_ATTRIBUTE_KEY = /[\w:_-]+/
      TYPO_ATTRIBUTE_VALUE = /(?:[A-Za-z0-9]+|(?:'[^']*?'|"[^"]*?"))/
      TYPO_ATTRIBUTE = /(?:#{TYPO_ATTRIBUTE_KEY}(?:\s*=\s*#{TYPO_ATTRIBUTE_VALUE})?)/
      TYPO_ATTRIBUTES = /(?:#{TYPO_ATTRIBUTE}(?:\s+#{TYPO_ATTRIBUTE})*)/
      TAG = %r{<[!/?\[]?(?:#{TYPO_TAG_KEY}|--)(?:\s+#{TYPO_ATTRIBUTES})?\s*(?:[!/?\]]+|--)?>}

      def strip_html
        self.gsub(TAG, '').gsub(/\s+/, ' ').strip
      end
    end
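
For anyone who wants to try it quickly, here is a rough, self-contained sketch of how the patterns behave on a few inputs. It assumes the asterisks, backslashes, straight quotes and `--` that the forum formatting appears to have swallowed are restored, and it wraps everything in a hypothetical `StripHtmlSketch` module (my name, not Typo's) with shortened constant names so that `String` is not reopened.

```ruby
# Sketch only: the module name and the shortened constant names are
# made up for this example; the patterns are assumed to match Typo's.
module StripHtmlSketch
  TAG_KEY = ATTRIBUTE_KEY = /[\w:_-]+/
  ATTRIBUTE_VALUE = /(?:[A-Za-z0-9]+|(?:'[^']*?'|"[^"]*?"))/
  ATTRIBUTE = /(?:#{ATTRIBUTE_KEY}(?:\s*=\s*#{ATTRIBUTE_VALUE})?)/
  ATTRIBUTES = /(?:#{ATTRIBUTE}(?:\s+#{ATTRIBUTE})*)/
  TAG = %r{<[!/?\[]?(?:#{TAG_KEY}|--)(?:\s+#{ATTRIBUTES})?\s*(?:[!/?\]]+|--)?>}

  # Remove anything that looks like a tag, then collapse runs of whitespace.
  def self.strip_html(text)
    text.gsub(TAG, '').gsub(/\s+/, ' ').strip
  end
end

# Plain tags and tags with quoted or bare attribute values are removed.
puts StripHtmlSketch.strip_html('<p>Hello <b>world</b></p>')
# => Hello world
puts StripHtmlSketch.strip_html('<a href="http://example.com" class=link>click   here</a>')
# => click here

# A bare "<" that does not start a tag is left alone.
puts StripHtmlSketch.strip_html('1 < 2')
# => 1 < 2

# Tags are replaced with an empty string, so adjacent words can run together.
puts StripHtmlSketch.strip_html('line one<br />line two')
# => line oneline two

# A comment containing punctuation is not matched by TAG and survives.
puts StripHtmlSketch.strip_html('<!-- hello, world -->')
# => <!-- hello, world -->
```

The last two cases are the sort of thing to watch for with comments: a `<br />` between words leaves no space behind, and an HTML comment only gets stripped when its body happens to look like a list of attributes.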

I haven't run into any edge cases where it fails yet, but if anyone finds one, a bug report would be welcome :)