Translating international characters

Hi

I need to convert strings with international characters to strings with corresponding ASCII codes. For example é, è, ë, and ê (and all other e-related versions) should convert to e and so on.

Does anyone have a good solution for this?

Kindest regards

Erik

I once did something very crude, which for your purpose would look
something like this:

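     def preprocess(query)
       # decompose accented characters into base letter + combining mark
       normalized = query.chars.normalize :d
       processed = ""
       normalized.u_unpack.each do |c|
         # skip combining marks (U+0300 up to, but not including, U+0370)
         unless c >= 0x300 && c < 0x370
           processed << [c].pack('U*')
         end
       end
       processed
     end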

Fred
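For anyone reading this on a newer Ruby: the same decompose-then-drop-combining-marks idea can be sketched without ActiveSupport, assuming Ruby 2.2 or later, where String#unicode_normalize is available. The method name strip_combining_marks is made up for this example and is not part of Fred's code.

```ruby
# Hypothetical helper name; a plain-Ruby take on the same idea.
def strip_combining_marks(str)
  # NFD splits an accented letter into base letter + combining mark,
  # e.g. U+00E9 becomes "e" followed by U+0301.
  decomposed = str.unicode_normalize(:nfd)
  # Keep every codepoint outside the combining diacritical marks block.
  kept = decomposed.each_codepoint.reject { |c| c >= 0x300 && c < 0x370 }
  # Pack the remaining codepoints back into a UTF-8 string.
  kept.pack('U*')
end
```

Called on "é, è, ë, ê" this gives "e, e, e, e". Characters that have no decomposition (ø, for instance) pass through unchanged, so the result is not guaranteed to be pure ASCII.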

Create a file core_extensions.rb in /lib/ and stick this in:

    require 'iconv'

    class String
      def to_ascii
        Iconv.iconv("ASCII//IGNORE//TRANSLIT", "UTF-8", self).join.sanitize
      rescue
        self.sanitize
      end

      def sanitize
        self.gsub(/[^a-z._0-9 -]/i, "").downcase
      end
    end

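Restart your Rails server to load the file. Then, when you want to convert a string, you just do something like "Thïs ïs à téststrïng".to_ascii and it will convert the characters to their ASCII equivalents. Note that sanitize also downcases the result and strips everything except letters, digits, spaces, dots, underscores and hyphens.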

Best regards

Peter De Berdt
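A note for later readers: in Ruby 2.0 and later, iconv is no longer part of the standard library. A rough stand-in can be sketched with String#unicode_normalize and String#encode, assuming Ruby 2.2 or later. The module name AsciiFold is hypothetical, and this is not a drop-in replacement for Iconv's TRANSLIT mode: characters without a decomposition, such as ø or ß, are simply dropped here.

```ruby
# Hypothetical helper module; the name AsciiFold is not from the thread.
module AsciiFold
  module_function

  # Decompose accented characters (NFKD), then drop everything that
  # has no US-ASCII representation, including the combining marks.
  def to_ascii(str)
    decomposed = str.unicode_normalize(:nfkd)
    ascii = decomposed.encode("US-ASCII", invalid: :replace,
                                          undef: :replace,
                                          replace: "")
    sanitize(ascii)
  end

  # Same filter as the sanitize method above: keep letters, digits,
  # dot, underscore, space and hyphen, then downcase.
  def sanitize(str)
    str.gsub(/[^a-z._0-9 -]/i, "").downcase
  end
end
```

With this, AsciiFold.to_ascii("Thïs ïs à téststrïng") returns "this is a teststring". A module is used instead of reopening String only to keep the sketch from touching core classes.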

convert é, è, ë, and ê .. to e, etc...

Try

    str = DiacriticsFu::escape(source)

with the file /lib/diacritic_fu.rb:

    module DiacriticsFu
      def self.escape(str)
        ActiveSupport::Multibyte::Handlers::UTF8Handler.
          normalize(str, :d).
          split(//u).
          reject { |e| e.length > 1 }.
          join
      end
    end

by Thibaut Barrère (found here: http://groups.google.ca/group/MephistoBlog/browse_thread/thread/afe817a4a594ddde ; there's even a test suite).

For example, I extended String with

    class String
      # "Un été À la maison".to_slug(true) == "un-ete-a-la-maison"
      def to_slug(force_downcase=false)
        str = DiacriticsFu::escape(self)
        str.gsub!(/[^a-zA-Z0-9 ]/, "")  # drop everything but letters, digits and spaces
        str.gsub!(/ +/, " ")            # collapse runs of spaces
        str.gsub!(/ /, "-")             # spaces become hyphens
        force_downcase ? str.downcase : str
      end
    end

Alain
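The slug idea can also be tried outside Rails. Below is a minimal stand-alone sketch, assuming Ruby 2.2 or later; it is a free function rather than Alain's String extension, it removes combining marks with the \p{Mn} character property instead of DiacriticsFu, and the extra strip of leading and trailing spaces is an addition of this sketch.

```ruby
# Hypothetical stand-alone version of the to_slug idea; not Alain's code.
def to_slug(str, force_downcase = false)
  # Decompose, then remove combining marks (Unicode category Mn).
  slug = str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
  # Drop punctuation and anything else that is not a letter, digit or space.
  slug = slug.gsub(/[^a-zA-Z0-9 ]/, "")
  # Trim the ends, then turn each run of spaces into a single hyphen.
  slug = slug.strip.gsub(/ +/, "-")
  force_downcase ? slug.downcase : slug
end
```

For example, to_slug("Un été À la maison", true) returns "un-ete-a-la-maison", matching the comment in the code above.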

Great advice from everybody. I will try these and see how they work. Thanks.

Erik