convert accented chars to their base

Is there a good way to convert "special" accented chars to their base
chars?
As an example, I want "àéìòù" => "aeiou".
I'm using several gsub calls now:
"àèìòù".gsub("à","a").gsub("è","e").gsub("ì","i")...
It works, but I wonder if there is something better than this.

Well something like

char_from = "àéìòù"

char_to = "aeiou"

x = "àéìòù".gsub(char_from, char_to)

puts x

would at least make the code more maintainable

Your code converts only that exact sequence of characters. What I need is to
convert those accented chars in every word,
so "città" => "citta", "caffè" => "caffe" and so on.
Maybe some regexp?
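
For what it's worth, here is a minimal sketch of a per-character replacement that works inside any word. It assumes Ruby 1.9 or later with UTF-8 strings, where String#tr maps whole characters rather than bytes. The constant names and the replace_accents helper are made up for illustration, and the table only covers the characters listed in it.

```ruby
# Hypothetical mapping tables: each character in ACCENTED is replaced by
# the character at the same position in PLAIN.
ACCENTED = "àèéìòù"
PLAIN    = "aeeiou"

# Replace every listed accented character, wherever it occurs in the text.
def replace_accents(text)
  text.tr(ACCENTED, PLAIN)
end

puts replace_accents("città")
puts replace_accents("caffè")
```

Characters that are not in the table pass through unchanged, so the table has to be extended by hand for every accent you care about.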

Mmm

Just noticed another problem

char_from = "àéìòù"

char_to = "aeiou"

puts char_from.size # => 10

puts char_to.size # => 5

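At least on my Mac. The problem here is encoding: size appears to be counting bytes, not characters, and each accented character takes two bytes in UTF-8.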
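
A small sketch of the byte-versus-character difference, assuming Ruby 1.9 or later, where strings carry an encoding. The string is written with \u escapes so the example does not depend on how this file itself is saved.

```ruby
# "àéìòù" written with escapes, one precomposed code point per letter.
accented = "\u00E0\u00E9\u00EC\u00F2\u00F9"
plain    = "aeiou"

puts accented.length    # number of characters
puts accented.bytesize  # number of bytes in the string's encoding
puts plain.length
puts plain.bytesize
puts accented.encoding
```

With UTF-8 the accented string has 5 characters but 10 bytes, which matches the 10 versus 5 seen above. On Ruby 1.8, size reports the byte count.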

Looks trickier than I first thought. It would be a cinch if this were Unicode and we were using Java :) Just decompose the Unicode character and drop the accent characters.

Ignore everything I have said and let's hope someone who knows about this can suggest a solution. I am intrigued by this.

I'm having this very same problem: String#upcase is not
uppercasing accented characters.
It seems that the problem is the encoding again.
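
For comparison, a short sketch of how this behaves on newer interpreters. It assumes Ruby 2.4 or later, where String#upcase applies full Unicode case mapping by default and the :ascii option restricts it to A-Z. On older versions only the ASCII letters are changed.

```ruby
word = "citt\u00E0"  # "città" with a precomposed à

# Ruby 2.4+: Unicode-aware by default.
puts word.upcase

# ASCII-only mapping, which leaves the accented letter in lowercase.
puts word.upcase(:ascii)
```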

regards

eugenio wrote:

your code convert only that sequence of character. what i need is to
convert those accented char in every word.
so "città" => "citta", "caffè" => "caffe" and so on
maybe some regexp?

You'll probably have to convert your text to Unicode normalization form D or
KD (NFD or NFKD), then filter out the combining marks.
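
A minimal sketch of that idea, assuming Ruby 2.2 or later (for String#unicode_normalize) and UTF-8 input. The strip_accents name is made up for illustration. Decomposing to NFD splits each accented letter into a base letter followed by combining marks, and \p{Mn} matches those nonspacing marks.

```ruby
# Decompose to NFD, then delete the combining (nonspacing) marks,
# leaving only the base characters.
def strip_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

puts strip_accents("città")
puts strip_accents("caffè")
puts strip_accents("àéìòù")
```

This needs no hand-written mapping table, but it only affects characters that actually decompose. Letters such as "ø" or "ß" have no canonical decomposition and come through unchanged.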

Best,