accented character conflation? [SOLVED]

Clearly this is a MySQL issue. I plead ignorance when it comes to the intricacies of collations in MySQL, but since I may not be the only one, here is a little tutorial.

MySQL allows charsets and collations to be set at several levels (server, database, table, column); these determine the ordering and conflation (equality) of strings.

Although even the server-level default can be set dynamically, once a collation is set for a database or table, changing the collation of the server or database will not influence queries against the existing tables.

For utf8, the two collations most relevant here are:

  • utf8_bin which is sensitive to all differences
  • utf8_general_ci (ci = case insensitive, but incidentally this collation is also accent insensitive)

There is NO utf8 collation which is case sensitive but accent insensitive, or the other way round. The default collation for utf8 is utf8_general_ci. Note that its accent insensitivity makes the naming of collations inconsistent, since for instance latin1_general_ci is accent sensitive; see http://bugs.mysql.com/bug.php?id=19567
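To make the difference between the two concrete, here is a rough pure-Ruby imitation of what each collation treats as equal. It is only a sketch: the method names are made up for illustration, and folding case and stripping combining marks merely approximates utf8_general_ci, whose real weight tables live inside MySQL.

```ruby
# Hypothetical helpers; they imitate, not reproduce, MySQL's comparison rules.

# utf8_bin-like: strings are equal only if their bytes are identical.
def bin_equal?(a, b)
  a.b == b.b
end

# utf8_general_ci-like: ignore case and accents before comparing.
def general_ci_like_equal?(a, b)
  fold = lambda do |s|
    # Decompose accented letters, drop the combining marks, then fold case.
    s.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
  end
  fold.call(a) == fold.call(b)
end

accented = "R\u00e9sum\u00e9" # "Résumé"
plain    = "resume"

puts bin_equal?(accented, plain)             # byte-by-byte comparison
puts general_ci_like_equal?(accented, plain) # comparison after folding
```

This is why a lookup for an unaccented word can match an accented row under the default collation but not under utf8_bin.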

In Rails you can change that default at the database level simply by setting collation: utf8_bin in config/database.yml; rake also takes this into account for tasks like db:create.
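As a sketch of what that setting looks like, the snippet below embeds a database.yml fragment in Ruby and parses it with the standard YAML library. The database name blog_development and the adapter value are assumptions for illustration; only the encoding and collation keys matter here.

```ruby
require "yaml"

# Illustrative config/database.yml fragment; everything except the
# encoding and collation keys is a placeholder.
DATABASE_YML = <<~CONFIG
  development:
    adapter: mysql
    database: blog_development
    encoding: utf8
    collation: utf8_bin
CONFIG

config = YAML.load(DATABASE_YML)
puts config["development"]["collation"]
```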

Running ActiveRecord::Base.connection.execute "SET collation_database = 'utf8_bin'" only has an effect when creating new tables.

You can also manipulate the connection (server-client communication) encoding dynamically. Importantly, however, as the manual notes (http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html):

For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.

Most relevantly, if you want to dynamically manipulate the collation for individual queries, you simply put a COLLATE clause such as COLLATE utf8_general_ci in your query. Since this is MySQL specific, Rails provides no generic support for specifying it via find parameters.
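Since there is no dedicated find option, one way to do this is to put the COLLATE clause into the conditions string yourself. The helper below is a hypothetical sketch (its name and the whitelist are my own, not part of Rails); it only builds a conditions array, and it checks the column and collation names because they are interpolated into the SQL.

```ruby
# Hypothetical helper, not a Rails API: builds a conditions array with an
# explicit COLLATE clause for a single equality comparison.
ALLOWED_COLLATIONS = %w[utf8_bin utf8_general_ci].freeze

def collated_condition(column, value, collation)
  # Both names end up in the SQL text, so reject anything unexpected.
  unless ALLOWED_COLLATIONS.include?(collation)
    raise ArgumentError, "unsupported collation: #{collation}"
  end
  unless column =~ /\A[a-z_][a-z0-9_]*\z/i
    raise ArgumentError, "suspicious column name: #{column}"
  end
  # The value stays a bind parameter and is quoted by the caller's adapter.
  ["#{column} = ? COLLATE #{collation}", value]
end

p collated_condition("name", "resume", "utf8_bin")
```

With a hypothetical User model, the result could be passed as User.find(:all, :conditions => collated_condition("name", "resume", "utf8_bin")), assuming the connection charset is utf8 so that the collation is valid for the bound value.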

Hope this was useful. Vik

Some links on available collations:

  • collation charts portal run by a MySQL engineer: http://www.collation-charts.org/
  • http://dev.mysql.com/doc/refman/5.0/en/charset.html
  • http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html