Problem with internationalization

I have I18n working.

I translated my en.yml file to German (de.yml) and have set DE as the default language in environment.rb.

Everything is almost working.

My problem is that I am getting this � character instead of ä on screen.

I have set <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> and also tried <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">, and neither produces the right result.

Any suggestions?

Have you checked your text editor encoding? Maybe it's silly to ask about that, but some IDEs like NetBeans occasionally have small problems with correct encoding :)

Paweł K wrote:

Have you checked your text editor encoding? Maybe it's silly to ask about that, but some IDEs like NetBeans occasionally have small problems with correct encoding :)

As far as I know, the encoding is correct. All the text editors I am using (V-Slick, Textmate, Notepad) seem to see the same proper things.

What is actually in the output - utf8 or iso latin 1 (ie look at the generated html with a hex editor) ?

Fred

What is actually in the output - utf8 or iso latin 1 (ie look at the generated html with a hex editor) ?

Fred

I believe it to be iso-8859-1.

I viewed the page source and saved it to disk. I hope the relevant sections are

- - - - - - - (From the program WinHex, which examined the page source)

Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F

00000DE0 66 69 72 6D 61 74 69 6F 6E 22 3E 4B 65 6E 6E 77 firmation">Kennw
00000DF0 6F 72 74 62 65 73 74 E4 74 69 67 75 6E 67 2A 3C ortbestätigung*<
00000E00 2F

- - - - - - -

(From the screen)

Kennwortbest�tigung*

- - - - - - - /

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//DE" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de" lang="de">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <title>Sign Up</title>
    <link href="/stylesheets/application.css?1264903648" media="screen" rel="stylesheet" type="text/css" />
    <link href="/stylesheets/formtastic.css?1265580898" media="screen" rel="stylesheet" type="text/css" />
    <link href="/stylesheets/formtastic_changes.css?1265580898" media="screen" rel="stylesheet" type="text/css" />
    <script src="/javascripts/jquery-1.4.1.js?1264986334" type="text/javascript"></script>
    <script src="/javascripts/application.js?1263055671" type="text/javascript"></script>

          <!-- Footnotes Style -->         <style type="text/css">           #footnotes_debug {margin: 2em 0 1em 0; text-align: center; color: #444; line-height: 16px;}           #footnotes_debug th, #footnotes_debug td {color: #444; line-height: 18px;}           #footnotes_debug a {text-decoration: none; color: #444; line-height: 18px;}           #footnotes_debug table {text-align: center;}           #footnotes_debug table td {padding: 0 5px;}           #footnotes_debug tbody {text-align: left;}           #footnotes_debug .name_values td {vertical-align: top;}           #footnotes_debug legend {background-color: #FFF;}           #footnotes_debug fieldset {text-align: left; border: 1px dashed #aaa; padding: 0.5em 1em 1em 1em; margin: 1em 2em; color: #444; background-color: #FFF;}           /* Aditional Stylesheets */             #queries_debug_info table td, #queries_debug_info table th{border:1px solid #A00; padding:0 3px; text-align:center;}   #queries_debug_info table thead, #queries_debug_info table tbody {color:#A00;}   #queries_debug_info p {background-color:#F3F3FF; border:1px solid #CCC; margin:12px; padding:4px 6px;}   #queries_debug_info a:hover {text-decoration:underline;}

        </style>         <!-- End Footnotes Style --> </head>

- - - - - --

I have run Mongrel and Webrick and I get the same results.

I have run Firefox and IE ... almost the same results.

What's the magic formula for getting the rendering engine to think that the page source is iso-8859-1?

Once I get that done then I will want to know how to get Ruby/Rails to generate UTF-8 and/or Unicode.

Ralph Shnelvar wrote:

What is actually in the output - utf8 or iso latin 1 (ie look at the generated html with a hex editor) ?

Fred

I believe it to be iso-8859-1.

I viewed the page source and saved it to disk. I hope the relevant sections are

- - - - - - - (From the program WinHex, which examined the page source)

Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F

00000DE0 66 69 72 6D 61 74 69 6F 6E 22 3E 4B 65 6E 6E 77 firmation">Kennw
00000DF0 6F 72 74 62 65 73 74 E4 74 69 67 75 6E 67 2A 3C ortbestätigung*<
00000E00 2F

- - - - - - -

(From the screen)

Kennwortbest�tigung*

Yes, that seems to be Latin-1. Why are you using Latin-1 and not UTF-{8|16} in the first place? They are supersets of Latin-1.
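For anyone following along, here is a minimal Ruby sketch of why that lone E4 byte turns into � when the page is decoded as UTF-8. It assumes Ruby 2.0 or later (for `String#b`), which may be newer than what this app runs on. The names `RAW_LABEL`, `interpret` and `latin1_to_utf8` are made up for illustration and are not part of Rails.

```ruby
# Bytes as WinHex showed them: a single E4 byte stands for "ä".
RAW_LABEL = "Kennwortbest\xE4tigung".b

# Reinterpret the same bytes under another encoding without changing them.
def interpret(bytes, encoding)
  bytes.dup.force_encoding(encoding)
end

# Transcode Latin-1 bytes into real UTF-8 (E4 becomes C3 A4).
def latin1_to_utf8(bytes)
  interpret(bytes, Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# As UTF-8, E4 opens a three-byte sequence that never completes,
# so the string is invalid and a browser substitutes a replacement character.
puts interpret(RAW_LABEL, Encoding::UTF_8).valid_encoding?  # false

# As Latin-1 every byte is a character, so transcoding works.
puts latin1_to_utf8(RAW_LABEL)
```

The point is that the bytes are fine as Latin-1. The mismatch is between what the bytes are and what the browser is told (or decides) they are.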

- - - - - - - /

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//DE" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de" lang="de">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <title>Sign Up</title>
    <link href="/stylesheets/application.css?1264903648" media="screen" rel="stylesheet" type="text/css" />
    <link href="/stylesheets/formtastic.css?1265580898" media="screen" rel="stylesheet" type="text/css" />
    <link href="/stylesheets/formtastic_changes.css?1265580898" media="screen" rel="stylesheet" type="text/css" />
    <script src="/javascripts/jquery-1.4.1.js?1264986334" type="text/javascript"></script>
    <script src="/javascripts/application.js?1263055671" type="text/javascript"></script>

[...]

You've got a number of problems here due to poor development decisions.

* You've declared your document as XHTML, but not used an <?xml ?> prolog. Therefore, it will be interpreted as UTF-8 as described at Character encodings.

* But you shouldn't be generating XHTML anyway. Browser support for XHTML is extremely problematic. See http://hixie.ch/advocacy/xhtml for further information. Use HTML 4 or 5 instead. If you use HTML 4, make sure to install the html_output plugin so that Rails will not use the XML-style <self-closing tag/> syntax, which is not valid in HTML 4.

* But even with HTML 4, I confess I don't see a single reason not to use UTF-8 on your pages. It will handle any character you throw at it, so you don't need to use different encodings for different languages.

- - - - - --

I have run Mongrel and Webrick and I get the same results.

I have run Firefox and IE ... almost the same results.

What's the magic formula for getting the rendering engine to think that the page source is iso-8859-1?

Once I get that done then I will want to know how to get Ruby/Rails to generate UTF-8 and/or Unicode.

Don't waste your time with Latin-1. Go straight to UTF-8. It will be easier and more versatile.

Best,

Yes, that seems to be Latin-1. Why are you using Latin-1 and not UTF-{8|16} in the first place? They are supersets of Latin-1.

I am aware of this, Marnen.

What I don't know is how to get Ruby/Rails to generate UTF-8 and/or Unicode.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//DE" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de" lang="de">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

You've got a number of problems here due to poor development decisions.

* You've declared your document as XHTML, but not used an <?xml ?> prolog. Therefore, it will be interpreted as UTF-8 as described at Character encodings.

OK ... that sounds like the problem ... I think.

* But you shouldn't be generating XHTML anyway. Browser support for XHTML is extremely problematic. See http://hixie.ch/advocacy/xhtml for further information. Use HTML 4 or 5 instead. If you use HTML 4, make sure to install the html_output plugin so that Rails will not use the XML-style <self-closing tag/> syntax, which is not valid in HTML 4.

Ok ... which do you recommend ... 4 or 5?

* But even with HTML 4, I confess I don't see a single reason not to use UTF-8 on your pages. It will handle any character you throw at it, so you don't need to use different encodings for different languages.

- - - - - --

I have run Mongrel and Webrick and I get the same results.

I have run Firefox and IE ... almost the same results.

What's the magic formula for getting the rendering engine to think that the page source is iso-8859-1?

Once I get that done then I will want to know how to get Ruby/Rails to generate UTF-8 and/or Unicode.

Don't waste your time with Latin-1. Go straight to UTF-8. It will be easier and more versatile.

OK ... so how do I get Ruby/Rails to generate UTF-8?

Ralph Shnelvar wrote:

Yes, that seems to be Latin-1. Why are you using Latin-1 and not UTF-{8|16} in the first place? They are supersets of Latin-1.

I am aware of this, Marnen.

What I don't know is how to get Ruby/Rails to generate UTF-8 and/or Unicode.

Why do you think you need to do something special here? Rails doesn't "generate" text, for the most part; it just manipulates and hands back what you give it.

In other words:

1. Given the ASCII string "Ralph Shnelvar", how would you have Rails print "Ralph Shnelvar"? (Hint: you should already know the answer to this.)

2. Given the UTF-8 string "ラルフ・シュネルヴァー", how would you have Rails print "ラルフ・シュネルヴァー"? (Hint: the answer is the same as for #1.)

Of course, you need to set the proper encoding headers in your HTML, and you need to have your DB encoding be UTF-8, but those are external to Rails.
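To make the header point concrete, here is a rough Rack-style sketch in plain Ruby. It is not Rails internals, and `utf8_page` is a hypothetical helper invented for this example. What it shows is that the charset declared in the HTTP Content-Type header should match the bytes actually sent, since that header generally takes precedence over a meta tag in the page.

```ruby
# Hypothetical helper: builds a Rack-style [status, headers, body] triplet
# whose declared charset matches the encoding of the body bytes.
def utf8_page(text)
  body = "<p>#{text}</p>".encode(Encoding::UTF_8)
  headers = {
    "Content-Type"   => "text/html; charset=utf-8",
    "Content-Length" => body.bytesize.to_s  # bytes, not characters
  }
  [200, headers, [body]]
end

status, headers, body = utf8_page("Kennwortbest\u00E4tigung")
puts status
puts headers["Content-Type"]
puts body.first.valid_encoding?
```

If the header said iso-8859-1 while the body was UTF-8 (or the other way round), you would get exactly the kind of garbling described in this thread.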

[...]

* You've declared your document as XHTML, but not used an <?xml ?> prolog. Therefore, it will be interpreted as UTF-8 as described at Character encodings.

OK ... that sounds like the problem ... I think.

I believe that's the immediate problem, yes.

* But you shouldn't be generating XHTML anyway. Browser support for XHTML is extremely problematic. See http://hixie.ch/advocacy/xhtml for further information. Use HTML 4 or 5 instead. If you use HTML 4, make sure to install the html_output plugin so that Rails will not use the XML-style <self-closing tag/> syntax, which is not valid in HTML 4.

Ok ... which do you recommend ... 4 or 5?

I've been using HTML 4. I don't know how well HTML 5 is supported by browsers currently in use, but that's because I haven't done the research yet. I'm sure others know more than I do on this point.

Best,

I tried to read through the thread but didn’t actually find an answer… Isn’t the solution to make sure that en.yml is UTF-8 encoded, i.e. saved in UTF-8?

–Lasse

Lasse Bunk wrote:

I tried to read through the thread but didn't actually find an answer... Isn't the solution to make sure that en.yml is UTF-8 encoded, i.e. saved in UTF-8?

That's part of the solution, yes -- this is another case of Rails simply processing the text you give it. Sorry; I should have been a bit clearer about that.
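If you want to check a locale file rather than trust the editor, something like the following sketch should do it. It assumes Ruby 2.1 or later and assumes the file is currently ISO-8859-1, which is a guess you should confirm before converting. The helper names `utf8_file?` and `convert_latin1_file_to_utf8` are made up here, and the demo runs on a temporary stand-in file rather than a real config/locales/de.yml.

```ruby
require "tempfile"

# True when the file's raw bytes form valid UTF-8 (pure ASCII also passes).
def utf8_file?(path)
  File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
end

# Rewrites the file in place as UTF-8, ASSUMING it is currently ISO-8859-1.
def convert_latin1_file_to_utf8(path)
  text = File.binread(path).force_encoding(Encoding::ISO_8859_1)
  File.binwrite(path, text.encode(Encoding::UTF_8))
end

# Demo on a temporary stand-in for a locale file saved as Latin-1.
Tempfile.create(["de", ".yml"]) do |file|
  file.binmode
  file.write("de:\n  confirm: Kennwortbest\xE4tigung\n".b)
  file.flush

  puts utf8_file?(file.path)  # false
  convert_latin1_file_to_utf8(file.path)
  puts utf8_file?(file.path)  # true
end
```

Run the conversion only once per file: converting a file that is already UTF-8 as if it were Latin-1 would double-encode the umlauts.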

--Lasse

Best,

Most of us use text editors that default to UTF-8 to start with; it escapes us why anyone would be using a Windows text editor from the eighties :)

Best regards

Peter De Berdt