Uniqueness Validator and Case Sensitivity

Hi,

I would like to ask what is the reason behind the fact that ActiveRecord Uniqueness Validator does the following:

if value.nil? || (options[:case_sensitive] || !column.text?)   sql = "#{sql_attribute} #{operator}" else   sql = "LOWER(#{sql_attribute}) = LOWER(?)" end

In other words, why it sends the 'LOWER" when case sensitivity is set to false (for string attributes)

This generates sql queries that do not use the index on the attribute.

My opinion is that ActiveRecod should just pass the requirement (case sensitive or not) to adapter (e.g. mysql2) and let the adapter decide how to do the job/construct the query (which would probably take into account db colation or do nothing and let db decide how to handle it)

Moreover, I find it inconsistent the fact that default behaviour of :uniqueness validator is case sensitive TRUE when finders are all case insensitive "don't care". Maybe, as a workaround, except from false and true, you can consider something like :uniqueness => {:case_sensitive => :db} and the db colation decide about it.

Currently, I have decided to stop using :uniqueness validator on string type attributes, because of all this mess with MySQL colation.

Panayotis

MySQL is the only database I’m aware of for which select 'Bla' = 'bla'; returns true. So while it looks odd from a MySQL user’s perspective to perform LOWER on both sides, it’s the correct way to do case insensitive matching on a SQL92 database. Also, on databases like PostgreSQL and Oracle at least, you can use functional indexes (e.g. CREATE INDEX users_lower_email_idx ON users (LOWER(email))) to make these lookups fast.

For your case, you should probably just not use the :case_insensitive flag, and know that MySQL is going to compare insensitively anyway. Even though aesthetically I’d like to see the Rails MySQL adapter behave consistently with the other databases (be case sensitive by default), I think it might be quite a rabbithole if Rails tried to force MySQL to do string comparison case sensitively because of MySQL’s lack of support for functional indexes. From a quick browse, there appears to be no way to perform indexed string equality lookups on the same column both case sensitively and insensitively – you’d have to create a pair of columns, one with case sensitive collation, the other without, in order to efficiently look up the content both ways. I’m not a MySQL expert though.

-john

Let us not compare RDBMS in this thread, because it is irrelevant. Almost all the databases allow you to define the COLLATION at creation time. (except ORACLE which has core problem to support this, as far as I know. I may be wrong though). PostgreSQL does, MySQL does, MS SQL Server does...to name a few.

So it is not that X RDBMS does this and Y RDBMS does the other. Hence, your statement "MySQL is the only database ....that does...." is not correct. If somebody sets the COLLATION at database creation to be CI (Case Insensitive) then `select .....where ....'Bla' = 'bla';` will return true. If the creator of the database sets the COLLATION to be CS (Case Sensitive), then even MySQL will return false.

ALL modern database management systems do that. And they have VERY GOOD reasons for doing that (which is out of the scope of this thread).

What I am trying to say here is that Rails/ActiveRecord SHOULD leave this responsibility (uniqueness validation with case or not sensitivity) to the db level, or AT LEAST let the user of ActiveRecord decide whether to use this ActiveRecord:uniqueness validation feature or not. Currently Rails OBLIGES the developer to USE IT (when using :uniqueness validator), by either setting this value to TRUE (the default value of :case_sensitive option) or FALSE. There should be another one extra value to this option, something like "IGNORE" or "DB" or something that will pass this responsibility to the database level (or db adapter level).

In summary,

*) I find it a serious design flaw that this is missing. *) Why the same feature (:case_sensitive => true/false) is not present on finders? BTW, It is correct that it does not.

All the other technical details to overcome this shortcoming are just workarounds, the better of which, according to my opinion is just to not use it and write your own :uniqueness validator for string column types.

-- Panayotis

I suppose I should have said “by default.” Apologies for the flame bait.

On a more useful note, master appears to have already changed this behavior for 3.2:

https://github.com/rails/rails/commit/c90e5ce779dbf9bd0ee53b68aee9fde2997be123

There’s no “let the DB decide my case sensitivity” option, but you at least get consistent behavior across databases, and you get to use indexes if you make the right decision when defining your field’s collation up front.

-john

Ahhhhh. Ok. Yes that's good. Thanks a lot. Will be waiting for the 3.2.

Thanks again -- Panayotis