(NOT mysql this time) incompatible character encodings: ASCII-8BIT and UTF-8


I fetched a forked mysql driver and got rid of this error for strings
fetched from the database.

BUT all I had to do to see this message again was to enter a non-ASCII
character in a (login) form field and there it was!

It doesn't matter that I don't even allow non-ASCII characters right
now because there's a validation regex restricting to [a-zA-Z0-9].

That means any user can disable my application with the innocent usage
of a perfectly legal character.

The last time I inquired about this issue was last September, and after
hearing "no solution" I abandoned any attempts at pre-beta Rails 3
testing. This is so essential, and so easy to trigger! I wonder why
it's still so easy to cause a major disaster?

Oh, btw, using the latest git/svn versions (as of this minute) of
Rails3-master and Ruby 1.9.2-head.


Not sure if it'll help you or not (especially since there isn't an
ActiveRecord driver yet) but my new Mysql2 gem forces the use of UTF-8
for connections to MySQL as well as strings in 1.9 - http://github.com/brianmario/mysql2

I'll be working on and releasing an ActiveRecord driver soon, but
would love some help if anyone knows of *any* docs on a driver spec?

Anyway, I'm curious if some simple tests with this gem fix your issue.


As I say in the subject line, it is NOT about mysql, I solved the
mysql-UTF issue already, so it won't help, sorry ;-(

Yeah, it seems the start of my post was a bit misleading.

The issue is I get the error when the parameters sent to Rails from
the form look like this (look at "login", which contains "mörre"):

Parameters: {"authenticity_token"=>"...", "user"=>{"login"=>"m\xC3\xB6rre", "password"=>"[FILTERED]", "remember_me"=>"0"},

This crashes Rails 3, here's the report:

Extracted source (around line #86):

83: <%= "<div class=\"flash #{key} round_large\"><span class=\"icon\">&nbsp;</span><div class=\"flash_text\">#{flash[key]}</div></div>".html_safe if flash[key] %>
84: <%- end -%>
85: <div id="bd">
86: <%= yield %>
87: </div>
88: <div id="ft">&nbsp;</div>
89: </div>

The views themselves contain a lot of UTF-8 characters, German Umlaute
ÄÖÜäöü :-)
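
For anyone who wants to reproduce this outside of Rails, here is a minimal sketch of what I assume is happening: the form value arrives tagged as ASCII-8BIT (binary), the template text is UTF-8 with umlauts, and Ruby 1.9 refuses to join the two. The variable names and the try_concat helper are made up for the illustration; only the byte sequence is taken from the log above.

```ruby
# encoding: utf-8

# The bytes from the log above: valid UTF-8 data, but tagged as binary.
login = "m\xC3\xB6rre".dup.force_encoding(Encoding::ASCII_8BIT)

# Stand-in for a view fragment that itself contains non-ASCII UTF-8 text.
template = "Größe: "

# Hypothetical helper: returns the joined string, or the exception if
# Ruby rejects the combination of encodings.
def try_concat(a, b)
  a + b
rescue Encoding::CompatibilityError => e
  e
end

# UTF-8 text with umlauts + binary string with high bytes: this fails.
result = try_concat(template, login)
puts result.class

# Re-tagging the same bytes as UTF-8 makes the concatenation work.
fixed = try_concat(template, login.dup.force_encoding(Encoding::UTF_8))
puts fixed
```

Note that a pure-ASCII value would not fail here, because Ruby treats ASCII-only strings as compatible with both encodings. That would explain why the error only shows up once somebody types an umlaut.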

Another example:

1) Creation of a URL by link_to when a parameter contains non-ASCII characters

When the tag is "Gemälde" the generated URL is <a class="tag5" href="/

That's okay (by the way, is that necessary? Do non-ASCII values in
parameters have to be encoded?).
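
On that question: as far as I know, URLs may only contain ASCII characters, so non-ASCII values are percent-encoded byte by byte. A small sketch using Ruby's standard library (not necessarily the code path link_to uses internally) shows what that looks like for the tag above:

```ruby
# encoding: utf-8
require "uri"

tag = "Gemälde"

# Percent-encode the UTF-8 bytes of the tag for use in a query string.
encoded = URI.encode_www_form_component(tag)
puts encoded # Gem%C3%A4lde

# Decoding gives back the original string, tagged as UTF-8 by default.
decoded = URI.decode_www_form_component(encoded)
puts decoded == tag
```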


2) In the controller I have to add "force_encoding('UTF-8')" in order
to prevent the dreaded error page from Rails:

@tags_string = params[:tags].force_encoding('UTF-8')

Now what? Do I have to add this to EVERY SINGLE string parameter I
receive? Because if I don't, then even if I THINK my app really only
uses ASCII characters, as soon as a user enters a non-ASCII character
in a form they'll trigger the Rails (Ruby) conversion error.
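
As a stopgap, the per-parameter force_encoding can at least be done in one place. The following is only a sketch of a workaround, not a fix for the underlying bug: force_utf8 is a hypothetical helper of my own (not part of Rails), and the sample hash just mimics the parameters shown earlier in the thread.

```ruby
# encoding: utf-8

# Hypothetical helper (not a Rails API): walks a nested params-like
# structure and returns a copy in which every String is tagged as UTF-8.
def force_utf8(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = force_utf8(v) }
  when Array
    value.map { |v| force_utf8(v) }
  when String
    # Only changes the encoding tag; the bytes are left untouched.
    value.dup.force_encoding(Encoding::UTF_8)
  else
    value
  end
end

# Sample data shaped like the request parameters, with a binary login.
params = {
  "user" => {
    "login"       => "m\xC3\xB6rre".dup.force_encoding(Encoding::ASCII_8BIT),
    "remember_me" => "0"
  }
}

clean = force_utf8(params)
puts clean["user"]["login"].encoding
puts clean["user"]["login"].valid_encoding?
```

In a controller something like this could presumably be called from a before_filter. Keep in mind that force_encoding does not validate anything, so a check with valid_encoding? is still needed if the client might send bytes that are not UTF-8.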

Just to finalize this thread: this (accepted, it seems) bug can now be
tracked on lighthouse: