utf-8 or encoding problems, need help!

Dmitry, my gut feeling is that you have to enforce the POST encoding in the form at least (for example with the form's accept-charset attribute), or otherwise detect when you have not received a UTF-8 encoded POST data string.

I am at a loss as to how a Latin-1 string ended up *bigger* than a UTF-8 one, but it's possible that you have run into some cut-and-paste artifacts. Try entering an umlaut using the character map (i.e. more naturally).
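To show why the size difference surprises me, here is a quick check of the byte counts for a single umlaut. This is only a sketch and assumes Ruby 1.9 or later, where strings carry an encoding; the variable names are just for illustration.

```ruby
# "ü" (U+00FC) written as an escape so the file's own encoding doesn't matter.
utf8   = "\u00FC"
latin1 = utf8.encode("ISO-8859-1")

puts utf8.bytesize    # 2 bytes in UTF-8 (0xC3 0xBC)
puts latin1.bytesize  # 1 byte in Latin-1 (0xFC)
```

So for the same text, the Latin-1 version should never come out longer than the UTF-8 one.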

Well yes, but it's not Rails' fault. In fact anyone can pass any kind of information to any kind of web system. Your system has to be robust enough to handle it.

Even with your best efforts to ensure everything comes across as UTF-8, users can still force it to be something that won't display properly, like Latin-1, Shift-JIS or whatever. In those cases you have to detect that you have received an invalid encoding and either convert it to UTF-8 or send back an error message.
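Roughly what I mean by "detect and convert", as a minimal sketch. It assumes Ruby 1.9 or later; normalize_to_utf8 is a made-up helper name, not anything Rails provides, and falling back to Latin-1 is purely my assumption about what your users are most likely to send.

```ruby
# Hypothetical helper (not part of Rails): returns a valid UTF-8 string,
# or nil if the input cannot be converted, so the caller can send an error.
def normalize_to_utf8(raw, fallback: Encoding::ISO_8859_1)
  # First, check whether the bytes are already well-formed UTF-8.
  candidate = raw.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  # Not valid UTF-8: assume the bytes are in the fallback encoding
  # (a guess!) and transcode them to UTF-8.
  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
rescue EncodingError
  nil
end
```

You would call it on each incoming parameter and reject the request when it returns nil. Bear in mind that the fallback really is a guess: Shift-JIS bytes that fail the UTF-8 check would be misread as Latin-1 and come out as garbage, so rejecting anything that is not valid UTF-8 may be the safer policy.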

I just thought that a particularly clever hacker might be able to exploit encoding confusion with multi-byte encoding systems to get around cross-site-scripting defences. It's just a thought, and I am thinking in general, not in a Rails context (which has some fairly serious XSS defences).