Validating uniqueness of field with utf8 chars fails with mysql

Hi!

While trying to get the (edge) Rails test suite to pass on my system, I
encountered two failures that show up in the native MySQL suite but
pass with PostgreSQL and SQLite. Both occur when validating the
uniqueness of string fields that contain UTF-8 characters.

1. activerecord/test/cases/validations/uniqueness_validation_test.rb:54

2. activerecord/test/cases/validations/uniqueness_validation_test.rb:241

Here is an example of a failing test:

def test_validate_uniqueness_with_utf8_chars
    Topic.validates_uniqueness_of(:title)

    Topic.create("title" => "一二三四五")
    assert !Topic.new("title" => "一二三四五").valid?, "Shouldn't be valid."
end

result:

  2) Failure:
test_validate_uniqueness_with_utf8_chars(UniquenessValidationTest)
    [/Users/mhennemeyer/Projekte/rails/activerecord/test/cases/validations/uniqueness_validation_test.rb:67:in `test_validate_uniqueness_with_utf8_chars'
     /Users/mhennemeyer/Projekte/rails/activesupport/lib/active_support/testing/setup_and_teardown.rb:60:in `run']:
Shouldn't be valid.
<false> is not true.

In case they do not display properly: the characters used in the title
field are the Chinese numerals for 1 through 5 (the same ones are used
in the real test code).

I have also opened a ticket that attempts to add a specialized utf8
test to the test suite and remove the special chars from the basic
test: http://rails.lighthouseapp.com/projects/8994/tickets/3536-separate-tests-for-validate-uniqueness-with-utf-characters
It would be nice if someone could review it.

Could you run these tests and confirm (or not) the failures, so I can
be sure that they are not somehow related to my environment?
From the activerecord folder:

$ rake test_mysql TEST=test/cases/validations/uniqueness_validation_test.rb

Thanks in advance

Matthias Hennemeyer

> Could you run these tests and verify (or not) them so I can be sure
> that they are not somehow related to my environment?

You need to make sure that your encoding and collation are set to
Unicode-friendly values rather than the default of latin1. I have the
collation set to utf8_unicode_ci and it works fine.
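
To illustrate the advice above, here is a minimal sketch of a connection
config with those settings, plus a quick sanity check. The helper
unicode_friendly? and the database name are hypothetical and not part of
Rails; the sketch also assumes the mysql adapter reads the :encoding and
:collation keys from the connection config.

```ruby
# Hypothetical helper (not part of Rails): reports whether a connection
# config hash asks for a utf8 encoding and a utf8 collation.
def unicode_friendly?(config)
  encoding  = config[:encoding].to_s
  collation = config[:collation].to_s
  encoding.start_with?("utf8") && collation.start_with?("utf8")
end

# Assumed settings for the MySQL test connection; the database name is
# only a placeholder.
MYSQL_TEST_CONFIG = {
  :adapter   => "mysql",
  :database  => "activerecord_unittest",
  :encoding  => "utf8",
  :collation => "utf8_unicode_ci"
}

# What a latin1 setup would look like, for comparison.
LATIN1_CONFIG = MYSQL_TEST_CONFIG.merge(
  :encoding  => "latin1",
  :collation => "latin1_swedish_ci"
)

puts unicode_friendly?(MYSQL_TEST_CONFIG)  # prints "true"
puts unicode_friendly?(LATIN1_CONFIG)      # prints "false"
```

Note that this only checks the config hash. A test database that was
already created with latin1 would probably need to be recreated or altered
as well, since the check does not look at the server.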