improve reliability of Inflector.transliterate


I'd like to ask for some feedback on a patch I just submitted.

The patch improves the reliability of ActiveSupport::Inflector.transliterate, which currently does not handle many characters from Danish, Swedish, Icelandic, Polish and other European languages.

This is because the current code relies on Unicode decomposition, but many common characters do not in fact decompose into an ASCII letter plus a diacritic. Two very common examples are Scandinavian "ø" and German "ß".
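To make the failure concrete, here is a minimal sketch of the decompose-and-strip idea. It is not the actual Rails code: `naive_transliterate` is a hypothetical helper, and it assumes Ruby 2.2 or later for `String#unicode_normalize`.

```ruby
# Hypothetical helper, not the Rails implementation: decompose to NFD,
# then drop every character that is still outside ASCII.
def naive_transliterate(string)
  string.unicode_normalize(:nfd).gsub(/[^[:ascii:]]/, "")
end

naive_transliterate("café")       # "é" decomposes to "e" + combining accent => "cafe"
naive_transliterate("Ærøskøbing") # "Æ" and "ø" have no decomposition   => "rskbing"
naive_transliterate("Straße")     # "ß" has no decomposition            => "Strae"
```

Characters with a canonical decomposition survive as their base letter; everything else is silently removed.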

If you don't speak any of the affected languages, imagine that the current method deleted every occurrence of the letter "s" from English strings, and so generated parameter strings like "234-aint-loui-cardinal" rather than "234-saint-louis-cardinals".

In a nutshell, the difference to developers is:

Inflector.parameterize("Ærøskøbing")
# before patch: "rskbing"
# after patch:  "aeroskobing"
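For readers who want a feel for how the "after" result can be produced, one possible approach is to map the non-decomposing characters explicitly before falling back to decomposition. The sketch below is only an illustration under that assumption: `APPROXIMATIONS` and `sketch_transliterate` are hypothetical names, the table is deliberately tiny, and the patch itself may work differently.

```ruby
# Hypothetical approximation table for characters that do not decompose.
# A real table would need many more entries.
APPROXIMATIONS = {
  "Æ" => "AE", "æ" => "ae",
  "Ø" => "O",  "ø" => "o",
  "ß" => "ss"
}.freeze

def sketch_transliterate(string)
  # 1. Replace the characters we have explicit approximations for.
  replaced = string.gsub(Regexp.union(APPROXIMATIONS.keys), APPROXIMATIONS)
  # 2. Decompose what is left and drop anything still outside ASCII.
  replaced.unicode_normalize(:nfd).gsub(/[^[:ascii:]]/, "")
end

sketch_transliterate("Ærøskøbing").downcase # => "aeroskobing"
sketch_transliterate("Straße")              # => "Strasse"
```

Doing the explicit replacement first means accented characters that do decompose still go through the existing path unchanged.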

Patch: gist:363923
LH ticket: #4374 Inflector#transliterate fails on many European characters