Clearly it is a MySQL issue. I plead ignorance when it comes to the intricacies of collations in MySQL,
but since I may not be the only one, here is a little tutorial.
MySQL allows charsets and collations on various levels (server, database, table, column)
that determine ordering and conflation (equality) of strings.
Though even the server level can be set dynamically, once a collation is set for a database or table, resetting
the collation for the server or database will not influence queries against it.
For utf8, the two collations that matter here are:
- utf8_bin, which is sensitive to all differences
- utf8_general_ci (ci = case insensitive, but incidentally this collation is also accent insensitive)
There is NO utf8 collation which is case sensitive but accent insensitive, or the other way round.
The default collation for utf8 is utf8_general_ci.
(Note that the accent insensitivity of utf8_general_ci means the naming of collations is inconsistent, since
for instance latin1_general_ci is accent sensitive; see http://bugs.mysql.com/bug.php?id=19567 )
In Rails you can easily change that default at the database level simply by setting
the collation (for example collation: utf8_bin) in the database configuration,
and rake will also take this into account for tasks like db:create.
By contrast, executing
ActiveRecord::Base.connection.execute "SET collation_database = 'utf8_bin'"
will only have an effect on tables created afterwards.
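As a rough illustration of the database-level setting, here is a minimal sketch that writes the connection settings as a plain Ruby hash; the same keys would normally live in config/database.yml. The adapter name, database name, and user are placeholders, and create_database_sql is a hypothetical helper that only approximates the kind of statement a task like db:create would issue, not the actual Rails implementation.

```ruby
# Connection settings as a plain hash. All concrete values are assumptions
# chosen for illustration; only the :encoding and :collation keys matter here.
DB_CONFIG = {
  adapter:   "mysql2",             # assumed adapter name
  database:  "myapp_development",  # hypothetical database name
  username:  "root",               # hypothetical user
  encoding:  "utf8",
  collation: "utf8_bin"
}.freeze

# Hypothetical helper: builds a CREATE DATABASE statement that carries the
# configured charset and collation as the database defaults.
def create_database_sql(config)
  "CREATE DATABASE `#{config[:database]}` " \
    "DEFAULT CHARACTER SET #{config[:encoding]} COLLATE #{config[:collation]}"
end

puts create_database_sql(DB_CONFIG)
```

Tables created in such a database inherit utf8_bin unless they declare their own collation.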
You can also manipulate the connection (server-client communication) encoding
dynamically. Importantly, however:
For comparisons of strings with column values, collation_connection does not matter because columns have their own collation,
which has a higher collation precedence.
Most relevantly, if you want to dynamically manipulate the collation for a query, you simply put
a COLLATE clause (for example COLLATE utf8_bin) in your query.
Since this is MySQL specific, no generic Rails support is provided to specify this via find parameters.
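To illustrate the per-query approach, here is a minimal sketch of building a conditions array with an explicit COLLATE clause. The helper name collated_condition, the whitelist, and the column name are assumptions made for this example and are not part of Rails.

```ruby
# Collations this sketch is willing to interpolate into SQL (an assumption;
# extend the list as needed). Whitelisting avoids injecting arbitrary text.
ALLOWED_COLLATIONS = %w[utf8_bin utf8_general_ci].freeze

# Hypothetical helper: returns a SQL fragment that compares a column to a
# bind placeholder under the given collation.
def collated_condition(column, collation)
  unless ALLOWED_COLLATIONS.include?(collation)
    raise ArgumentError, "unsupported collation: #{collation}"
  end
  unless column =~ /\A[a-z_][a-z0-9_]*\z/i
    raise ArgumentError, "invalid column name: #{column}"
  end
  "#{column} COLLATE #{collation} = ?"
end

# A conditions array in the [sql, value] form that ActiveRecord finders accept.
conditions = [collated_condition("name", "utf8_bin"), "Apple"]
puts conditions.inspect
```

With utf8_bin forced this way, the comparison is case and accent sensitive for that one query, regardless of the column's own collation.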
Hope this was useful.
See also the collation charts portal run by a MySQL engineer.