Hi
I need to convert strings containing international characters into strings
with the corresponding ASCII characters. For example, é, è, ë, and ê (and all
other variants of e) should convert to e, and so on.
Does anyone have a good solution for this?
Kindest regards
Erik
I once did something very crude, which for your purpose would look
something like this:
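def preprocess(query)
  # Decompose each accented character into base letter + combining mark
  normalized = query.chars.normalize :d
  processed = ""
  normalized.u_unpack.each do |c|
    # Skip combining diacritical marks (U+0300..U+036F); keep everything else
    next if c >= 0x300 && c < 0x370
    processed << [c].pack('U*')
  end
  processed
end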
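For readers on Ruby 2.2 or later, the same decompose-then-drop idea can be sketched without the Rails multibyte helpers. This is only an illustration, assuming String#unicode_normalize from the standard library is available; the method name strip_combining_marks is made up for the example.

```ruby
# Hypothetical helper (name chosen for this example only).
# Assumes Ruby 2.2+, where String#unicode_normalize is available.
def strip_combining_marks(text)
  # NFD turns "é" into "e" followed by U+0301 (combining acute accent)
  decomposed = text.unicode_normalize(:nfd)
  # Drop code points in the combining diacritical marks block (U+0300..U+036F)
  decomposed.each_char.reject { |ch| (0x300..0x36F).cover?(ch.ord) }.join
end

puts strip_combining_marks("Thïs ïs à téststrïng")  # prints "This is a teststring"
```

Characters that have no decomposition (for example ø or ß) pass through unchanged with this approach, so it is not a full transliteration.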
Fred
Create a file core_extensions.rb in /lib/ and stick this in:
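require 'iconv'

class String
  # Transliterate to ASCII, then strip anything sanitize does not allow
  def to_ascii
    Iconv.iconv("ASCII//IGNORE//TRANSLIT", "UTF-8", self).join.sanitize
  rescue
    self.sanitize
  end

  # Keep only letters, digits, dot, underscore, space and hyphen; lowercase the result
  def sanitize
    self.gsub(/[^a-z._0-9 -]/i, "").downcase
  end
end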
Restart your Rails server to load the file. Then, when you want to convert a string, call something like "Thïs ïs à téststrïng".to_ascii and it will convert the characters to their ASCII equivalents.
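To see what the sanitize step does on its own, here is a small standalone sketch. It uses the same regular expression as above but as a plain method; the name sanitize_ascii is hypothetical and not part of the code in this post. It also illustrates why the transliteration step matters: without it, accented letters are dropped rather than converted.

```ruby
# Hypothetical standalone version of the sanitize step (name is illustrative).
def sanitize_ascii(text)
  # Remove everything except ASCII letters, digits, ".", "_", space and "-",
  # then lowercase what is left
  text.gsub(/[^a-z._0-9 -]/i, "").downcase
end

puts sanitize_ascii("Hello, World! 2024")  # prints "hello world 2024"
puts sanitize_ascii("café")                # prints "caf" (the é is dropped, not converted)
```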
Best regards
Peter De Berdt
convert é, è, ë, and ê .. to e, etc...
Try
str = DiacriticsFu::escape(source)
with
file /lib/diacritic_fu.rb :
module DiacriticsFu
  def self.escape(str)
    ActiveSupport::Multibyte::Handlers::UTF8Handler.
      normalize(str,:d).
      split(//u).
      reject { |e| e.length > 1 }.
      join
  end
end
by Thibaut Barrère
(found here: http://groups.google.ca/group/MephistoBlog/browse_thread/thread/afe817a4a594ddde ;
there's even a test suite)
For example, I extended String with
class String
  # "Un été À la maison".to_slug(true) == "un-ete-a-la-maison"
  def to_slug(force_downcase=false)
    str = DiacriticsFu::escape(self)
    str.gsub!(/[^a-zA-Z0-9 ]/,"")
    str.gsub!(/ +/," ")
    str.gsub!(/ /,"-")
    force_downcase ? str.downcase : str
  end
end
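The slug steps above (strip diacritics, drop other symbols, collapse spaces, join with hyphens) can also be sketched without DiacriticsFu. This is an illustration under assumptions, not the code from this post: slugify is a hypothetical name, it assumes Ruby 2.2+ for String#unicode_normalize, and it removes accents by matching the \p{Mn} (nonspacing mark) regex property instead of filtering by byte length.

```ruby
# Hypothetical standalone slug helper (name is illustrative).
# Assumes Ruby 2.2+ for String#unicode_normalize.
def slugify(text, force_downcase = false)
  # Decompose accented letters, then delete the nonspacing marks
  ascii = text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
  # Drop anything that is not an ASCII letter, digit or space,
  # then turn each run of spaces into a single hyphen
  slug = ascii.gsub(/[^a-zA-Z0-9 ]/, "").gsub(/ +/, "-")
  force_downcase ? slug.downcase : slug
end

puts slugify("Un été À la maison", true)  # prints "un-ete-a-la-maison"
```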
Alain
Great advice from everybody. I will try these and see how they work.
Thanks.
Erik