Uniqueness Validator and Case Sensitivity

Hi,

I would like to ask what is the reason behind the fact that
ActiveRecord Uniqueness Validator does the following:

if value.nil? || (options[:case_sensitive] || !column.text?)
  sql = "#{sql_attribute} #{operator}"
else
  sql = "LOWER(#{sql_attribute}) = LOWER(?)"
end

In other words, why it sends the 'LOWER" when case sensitivity is set
to false (for string attributes)

This generates sql queries that do not use the index on the attribute.

My opinion is that ActiveRecod should just pass the requirement (case
sensitive or not) to adapter (e.g. mysql2) and let the
adapter decide how to do the job/construct the query (which would
probably take into account db colation or do nothing and let db decide
how to handle it)

Moreover, I find it inconsistent the fact that default behaviour
of :uniqueness validator is case sensitive TRUE when finders are all
case insensitive "don't care". Maybe, as a workaround, except from
false and true, you can consider something like :uniqueness =>
{:case_sensitive => :db} and the db colation decide about it.

Currently, I have decided to stop using :uniqueness validator on
string type attributes, because of all this mess with MySQL colation.

Panayotis

MySQL is the only database I’m aware of for which select 'Bla' = 'bla'; returns true. So while it looks odd from a MySQL user’s perspective to perform LOWER on both sides, it’s the correct way to do case insensitive matching on a SQL92 database. Also, on databases like PostgreSQL and Oracle at least, you can use functional indexes (e.g. CREATE INDEX users_lower_email_idx ON users (LOWER(email))) to make these lookups fast.

For your case, you should probably just not use the :case_insensitive flag, and know that MySQL is going to compare insensitively anyway. Even though aesthetically I’d like to see the Rails MySQL adapter behave consistently with the other databases (be case sensitive by default), I think it might be quite a rabbithole if Rails tried to force MySQL to do string comparison case sensitively because of MySQL’s lack of support for functional indexes. From a quick browse, there appears to be no way to perform indexed string equality lookups on the same column both case sensitively and insensitively – you’d have to create a pair of columns, one with case sensitive collation, the other without, in order to efficiently look up the content both ways. I’m not a MySQL expert though.

-john

Let us not compare RDBMS in this thread, because it is irrelevant.
Almost all the databases allow you to define the COLLATION at creation
time.
(except ORACLE which has core problem to support this, as far as I
know. I may be wrong though).
PostgreSQL does, MySQL does, MS SQL Server does...to name a few.

So it is not that X RDBMS does this and Y RDBMS does the other.
Hence, your statement "MySQL is the only database ....that does...."
is not
correct. If somebody sets the COLLATION at database creation to be CI
(Case Insensitive)
then `select .....where ....'Bla' = 'bla';` will return true. If the
creator of the database
sets the COLLATION to be CS (Case Sensitive), then even MySQL will
return false.

ALL modern database management systems do that. And they have VERY
GOOD reasons
for doing that (which is out of the scope of this thread).

What I am trying to say here is that Rails/ActiveRecord SHOULD leave
this responsibility
(uniqueness validation with case or not sensitivity) to the db level,
or AT LEAST let the user of
ActiveRecord decide whether to use this ActiveRecord:uniqueness
validation feature or not.
Currently Rails OBLIGES the developer to USE IT (when
using :uniqueness validator),
by either setting this value to TRUE (the default value
of :case_sensitive option) or FALSE.
There should be another one extra value to this option, something like
"IGNORE" or "DB" or something
that will pass this responsibility to the database level (or db
adapter level).

In summary,

*) I find it a serious design flaw that this is missing.
*) Why the same feature (:case_sensitive => true/false) is not present
on finders? BTW, It is correct that it does
not.

All the other technical details to overcome this shortcoming are just
workarounds, the better of
which, according to my opinion is just to not use it and write your
own :uniqueness validator for
string column types.

-- Panayotis

I suppose I should have said “by default.” Apologies for the flame bait.

On a more useful note, master appears to have already changed this behavior for 3.2:

https://github.com/rails/rails/commit/c90e5ce779dbf9bd0ee53b68aee9fde2997be123

There’s no “let the DB decide my case sensitivity” option, but you at least get consistent behavior across databases, and you get to use indexes if you make the right decision when defining your field’s collation up front.

-john

Ahhhhh. Ok. Yes that's good. Thanks a lot.
Will be waiting for the 3.2.

Thanks again
-- Panayotis