UTF-8 as default database encoding

As you might know, the db:create rake task creates utf8 databases for MySQL, Postgres and SQLite. http://dev.rubyonrails.org/ticket/8448

The only problem is that you need to manually add the encoding to your database config file.
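For anyone who hasn't done this by hand, here is a rough sketch of what an entry with the encoding set might look like. The database name, username and host are placeholders I made up, and the Ruby wrapper is only there to show that the key gets picked up when the YAML is parsed (it assumes a Ruby recent enough to support squiggly heredocs). For PostgreSQL the value would be `unicode` rather than `utf8`, as far as I know.

```ruby
require "yaml"

# Hypothetical database.yml content; all values except `encoding` are placeholders.
DATABASE_YML = <<~YAML
  development:
    adapter: mysql
    encoding: utf8
    database: myapp_development
    username: root
    password:
    host: localhost
YAML

# Parse the YAML the same way any Ruby program could and read the setting back.
config = YAML.load(DATABASE_YML)
puts config["development"]["encoding"]
```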

Because I'm lazy, and I believe most people are, I submitted a patch that adds the utf-8/unicode encoding to the auto-generated database.yml file. A lot of people still don't know that you can define the encoding to use, and I think this patch should help.

http://dev.rubyonrails.org/ticket/8701

What do you think?

Matt Aimonetti

It's not just about being lazy; it's first and foremost about Doing the Right Thing™. When the encoding of the database is UTF-8, it's reasonable to assume that the driver should also be in UTF-8 mode. Not many people know that you have to set the encoding in the database driver, let alone why.

Manfred

AIUI, the server encoding and the client encoding (here defined by the driver) don't necessarily have to match. In cases where the client encoding doesn't match the server encoding, the server translates between the client encoding and the server encoding. For related PostgreSQL documentation, see

I'd be surprised if other database servers don't have similar functionality.

I think a more likely problem is when the HTML charset and client encoding don't match.

Of course, if the default is UTF-8 for the HTML, it makes sense to use UTF-8 in the database client (driver) as well.

Michael Glaesemann grzm seespotcode net

Manfred, 'lazy' was indeed not the best word to describe the reason behind this patch, but thanks for your support.

Here is an example of what could happen if you use utf-8 data without the proper encoding set in your driver.

This is what you should see: a drop-down with three languages, each named in its own language.

This is what you get if you don't set the encoding to utf-8 in your database.yml file:

And just for fun, here is the same list with the HTML charset set to ISO-8859-1:

The above screenshots were taken using MySQL. When using PostgreSQL without defining the encoding for the db driver, the data is displayed properly (in utf-8, as shown in screenshot #1).
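To give an idea of the kind of garbling involved, the ISO-8859-1 case can be roughly simulated in plain Ruby. This is only a sketch of the general mechanism (UTF-8 bytes being interpreted under the wrong charset), not a reproduction of what MySQL or the browser does exactly. The language names and the `misread_as_latin1` helper are made up for illustration, and it assumes a Ruby with string encoding support (1.9 or later).

```ruby
# Illustrative language names stored as UTF-8 (not the original list).
names = ["Français", "Español"]

# Hypothetical helper: treat the UTF-8 bytes as if they were ISO-8859-1,
# then convert the result to UTF-8 so it can be printed.
def misread_as_latin1(utf8_string)
  utf8_string.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

garbled = names.map { |name| misread_as_latin1(name) }
garbled.each { |g| puts g }
```

Each accented character takes two bytes in UTF-8, so it shows up as two unrelated characters once those bytes are read one at a time.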

I added the encoding setting to the PostgreSQL template because Manfred asked for it, and also because I think it's good to be as consistent as possible.

-Matt

Patch merged-in

http://dev.rubyonrails.org/changeset/7116

Thanks to bitsweat :)