(NOT mysql this time) incompatible character encodings: ASCII-8BIT and UTF-8

Hi,

I fetched a forked mysql driver and got rid of this error for strings fetched from the database.

BUT all I had to do to see this message again was to enter a non-ASCII character in a (login) form field and there it was!

It doesn't matter that I don't even allow non-ASCII characters right now: there's a validation regex restricting the login to [a-zA-Z0-9].

That means any user can disable my application with the innocent usage of a perfectly legal character.
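To illustrate, here is a minimal sketch of such a validation in plain Ruby, outside Rails. The names LOGIN_FORMAT and valid_login? are hypothetical, and the character class assumes [a-zA-Z0-9] is what was meant. The check itself rejects the input without raising, which suggests the exception comes from somewhere else, possibly from rendering the rejected value back into the page.

```ruby
# Hypothetical names; the character class assumes [a-zA-Z0-9] was intended.
LOGIN_FORMAT = /\A[a-zA-Z0-9]+\z/

# Returns true only if the login consists entirely of ASCII letters and digits.
def valid_login?(login)
  LOGIN_FORMAT.match?(login)
end

# A submitted value as raw bytes, tagged ASCII-8BIT like the form parameter.
submitted = "m\xC3\xB6rre".b

puts valid_login?("moerre")   # true
puts valid_login?(submitted)  # false, and no exception is raised here
```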

The last time I asked about this issue was last September, and after hearing "no solution" I abandoned any attempt at pre-beta Rails 3 testing. This is so essential, and so easy to trigger! I wonder why it's still so easy to cause a major disaster?

Oh, btw, using the latest git/svn versions (as of this minute) of Rails3-master and Ruby 1.9.2-head.

Michael

Not sure if it'll help you or not (especially since there isn't an ActiveRecord driver yet) but my new Mysql2 gem forces the use of UTF-8 for connections to MySQL as well as strings in 1.9 - http://github.com/brianmario/mysql2

I'll be working on and releasing an ActiveRecord driver soon, but would love some help if anyone knows of *any* docs on a driver spec?

Anyway, I'm curious if some simple tests with this gem fix your issue.

-Brian

As I say in the subject line, it is NOT about mysql, I solved the mysql-UTF issue already, so it won't help, sorry ;-(

Yeah, it seems my post started out a bit misleadingly.

The issue is I get the error when the parameters sent to Rails from the form look like this (look at "login", which contains "mörre"):

Parameters: {"authenticity_token"=>"...", "user"=>{"login"=>"m\xC3\xB6rre", "password"=>"[FILTERED]", "remember_me"=>"0"}, "commit"=>"..."}

This crashes Rails 3, here's the report:

Extracted source (around line #86):

83: <%= "<div class=\"flash #{key} round_large\"><span class=\"icon\">&nbsp;</span><div class=\"flash_text\">#{flash[key]}</div></div>".html_safe if flash[key] %>
84: <%- end -%>
85: <div id="bd">
86: <%= yield %>
87: </div>
88: <div id="ft">&nbsp;</div>
89: </div>

The views themselves contain a lot of UTF-8 characters, German Umlaute ÄÖÜäöü :-)
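The combination can be reproduced in plain Ruby 1.9 or later, without Rails. This is only a sketch under the assumption that the parameter arrives tagged ASCII-8BIT while the template text is UTF-8; the helper name concat_error is made up for the example.

```ruby
# Hypothetical helper: returns the exception raised by a + b, or nil if none.
def concat_error(a, b)
  a + b
  nil
rescue Encoding::CompatibilityError => e
  e
end

# Template text in UTF-8 that contains a non-ASCII character.
fragment = "Gruß, "

# The submitted login as raw bytes, tagged ASCII-8BIT (binary).
login = "m\xC3\xB6rre".b

error = concat_error(fragment, login)
puts error.class
puts error.message

# A binary string holding only ASCII bytes still concatenates without error.
puts concat_error(fragment, "moerre".b).inspect
```

If either side contains only ASCII bytes, Ruby treats the two strings as compatible, which would explain why the problem stays hidden until someone types an umlaut.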

Another example:

1) Creation of a URL by link_to when a parameter contains non-ASCII characters

When the tag is "Gemälde" the generated URL is <a class="tag5" href="/items?tags=Gem%C3%A4lde">

That's okay (by the way, is this necessary? Do non-ASCII values in parameters have to be encoded?).

HOWEVER,

2) In the controller I have to add "force_encoding('UTF-8')" in order to prevent the dreaded error page from Rails:

@tags_string = params[:tags].force_encoding('UTF-8')

Now what? Do I have to add this to EVERY SINGLE string parameter I receive? Because if I don't, then even if I THINK my app really only uses ASCII characters, as soon as a user enters a non-ASCII character in a form they'll trigger the Rails (Ruby) conversion error message.
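As a stopgap, instead of touching each parameter by hand, the whole params structure could be retagged in one place. The following is a minimal sketch in plain Ruby, not anything Rails provides: retag_utf8 is a hypothetical helper, and it assumes the browser really did send UTF-8 bytes.

```ruby
# Hypothetical helper: returns a copy of a nested params structure with every
# String retagged as UTF-8. force_encoding changes only the encoding label,
# never the bytes, so the original strings are duplicated first.
def retag_utf8(value)
  case value
  when String then value.dup.force_encoding(Encoding::UTF_8)
  when Hash   then value.each_with_object({}) { |(k, v), out| out[k] = retag_utf8(v) }
  when Array  then value.map { |v| retag_utf8(v) }
  else value
  end
end

# Made-up parameters, tagged ASCII-8BIT like the ones in the log above.
params = {
  "user" => { "login" => "m\xC3\xB6rre".b, "remember_me" => "0".b },
  "tags" => ["Gem\xC3\xA4lde".b]
}

clean = retag_utf8(params)
login = clean["user"]["login"]

puts login.encoding         # UTF-8
puts login.valid_encoding?  # true
puts "Grüße, " + login      # concatenates without an encoding error
```

In a controller this could run once in a before_filter rather than per parameter. Since force_encoding does not validate anything, checking valid_encoding? afterwards is a way to catch clients that send bytes which are not valid UTF-8.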

Just to wrap up this thread: this (accepted, it seems) bug can now be tracked on Lighthouse:

https://rails.lighthouseapp.com/projects/8994/tickets/4336-ruby19-submitted-string-form-parameters-with-non-ascii-characters-cause-encoding-errors#ticket-4336-1