Number to UTF-8 conversion question

All, sorry for cross-posting, but I have been stuck on this for quite some time:

I have a variable x = 1046

How can I convert it into a UTF-8 character? x.chr does not work for it.

Basically, I need to put x in a string as a UTF-8 character to display on a page.

Regards,

- newB

Do you mean to say that x holds a Unicode code point? If that's the case (since ASCII is a subset of Unicode, x.to_s => "1046" is trivial), then you can use something like this code I wrote a while back:

   > ("U+"+('0'*4+x.to_s(16))[-4,4]).to_utf8
   => "\320\226"

Of course, you could hide most of that in an Integer#to_utf8 method.
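For illustration, here is a rough sketch of what such an Integer#to_utf8 method might look like. It is not the code from the attached file: it encodes the integer directly instead of going through the "U+xxxx" string, it also covers the four-byte range, and it assumes Ruby 1.9 or later (for force_encoding). Treat it as one possible approach under those assumptions.

```ruby
# Hypothetical helper, not part of the original post: encodes an Integer
# code point as a UTF-8 string by hand. Assumes Ruby 1.9+ (force_encoding).
class Integer
  def to_utf8
    bytes =
      case self
      when 0..0x7F
        # One byte: plain ASCII.
        [self]
      when 0x80..0x7FF
        # Two bytes: 110yyyyy 10xxxxxx
        [0xC0 | (self >> 6), 0x80 | (self & 0x3F)]
      when 0xD800..0xDFFF
        # Surrogates are not valid Unicode scalar values.
        raise ArgumentError, "surrogate code point: #{self}"
      when 0x800..0xFFFF
        # Three bytes: 1110zzzz 10yyyyyy 10xxxxxx
        [0xE0 | (self >> 12),
         0x80 | ((self >> 6) & 0x3F),
         0x80 | (self & 0x3F)]
      when 0x10000..0x10FFFF
        # Four bytes: 11110www 10zzzzzz 10yyyyyy 10xxxxxx
        [0xF0 | (self >> 18),
         0x80 | ((self >> 12) & 0x3F),
         0x80 | ((self >> 6) & 0x3F),
         0x80 | (self & 0x3F)]
      else
        raise ArgumentError, "code point out of range: #{self}"
      end
    # Pack the raw bytes and tag the result as UTF-8.
    bytes.pack('C*').force_encoding('UTF-8')
  end
end
```

With this in place, 1046.to_utf8 yields the same two bytes as the expression above ("\320\226").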

-Rob

Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com

# -*- ruby -*-

class String
  # For a string that matches /(?i:U\+?|\\u)?[[:xdigit:]]{4}/, return a
  # suitable UTF-8 string for that character.
  def to_utf8
    case point = self.match(/[[:xdigit:]]{4}/)[0].to_i(16)
    when 0..0x7f
      point.chr
    when 0x80..0x07ff
      x = point & 0b111111
      point >>= 6
      y = point
      "#{(0xC0 | y).chr}#{(0x80 | x).chr}"
    when 0x0800..0xFFFF
      x = point & 0b111111
      point >>= 6
      y = point & 0b111111
      point >>= 6
      z = point
      "#{(0xE0 | z).chr}#{(0x80 | y).chr}#{(0x80 | x).chr}"
    when 0x10000..0x10FFFF
      raise NotImplementedError, "UTF-8 four byte sequences not yet supported"
    else
      raise ArgumentError, "Values above U+10FFFF are not supported"
    end
  end
end

if __FILE__ == $0
  require 'test/unit'
  class UnicodeHelperTest < Test::Unit::TestCase
    def test_ascii
      assert_equal '!', "U+0021".to_utf8, 'EXCLAMATION MARK'
      assert_equal 'A', "U+0041".to_utf8, 'UPPERCASE LETTER A'
      assert_equal '-', "U+002D".to_utf8, 'HYPHEN-MINUS'
      assert_equal '~', "U+007E".to_utf8, 'TILDE'

      assert_equal '!', "0021".to_utf8, 'EXCLAMATION MARK'
      assert_equal 'A', "0041".to_utf8, 'UPPERCASE LETTER A'
      assert_equal '-', "002D".to_utf8, 'HYPHEN-MINUS'
      assert_equal '~', "007E".to_utf8, 'TILDE'

      assert_equal '!', "\\u0021".to_utf8, 'EXCLAMATION MARK'
      assert_equal 'A', "\\u0041".to_utf8, 'UPPERCASE LETTER A'
      assert_equal '-', "\\u002D".to_utf8, 'HYPHEN-MINUS'
      assert_equal '~', "\\u007E".to_utf8, 'TILDE'
    end

    def test_hi_bit_ascii
      # U+0080 is a C1 control character and U+00A4 is CURRENCY SIGN.
      assert_equal "\xC2\x80", "U+0080".to_utf8, "C1 CONTROL CHARACTER"
      assert_equal "\xC2\xA4", "U+00A4".to_utf8, "CURRENCY SIGN"
    end

    def test_general_punctuation
      assert_equal "\342\200\220", "U+2010".to_utf8, "HYPHEN"
      assert_equal "\342\200\221", "U+2011".to_utf8, "NON-BREAKING HYPHEN"
      assert_equal "\342\200\222", "U+2012".to_utf8, "FIGURE DASH"
      assert_equal "\342\200\223", "U+2013".to_utf8, "EN DASH"
      assert_equal "\342\200\224", "U+2014".to_utf8, "EM DASH"
      assert_equal "\342\200\225", "U+2015".to_utf8, "QUOTATION DASH"
    end
  end
end
__END__

Depending on what you are doing,

[1046].pack('U')

may also be appropriate.
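
To make that concrete, here is a small sketch of the pack approach next to Integer#chr with an explicit encoding. The chr variant assumes Ruby 1.9 or later, and the html variable and the surrounding <p> markup are just placeholders for "a string displayed on a page".

```ruby
x = 1046

# Array#pack with the 'U' directive encodes each integer as a UTF-8 character.
packed = [x].pack('U')

# Assumption: Ruby 1.9 or later, where Integer#chr accepts an encoding.
via_chr = x.chr(Encoding::UTF_8)

# Interpolate the character into a string destined for the page
# (the <p> wrapper is purely illustrative).
html = "<p>#{packed}</p>"
```

Both calls produce the same two-byte string ("\320\226", CYRILLIC CAPITAL LETTER ZHE), so either can be dropped into the page output.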

Fred