ActiveRecord::Base.connection.create_database defaults to latin1

If you use ActiveRecord::Base.connection.create_database you'll notice that by default the created database uses latin1 encoding. I created a plugin to handle different charsets and collations (on top of helping you with other boring DB tasks).

You can check out the early version of the plugin with: svn checkout svn://rubyforge.org/var/svn/raketasks/db_tasks (I really need to move my projects off RubyForge; it's a real pain to browse the code there.)

Anyway, if people consider it an important feature, I'd be glad to submit a patch. Oh, by the way, when you create your database with a specific charset, tables inherit the charset and collation automatically. Rails handles the connection encoding through an encoding value in the environment configuration file (database.yml). I have a rake task that adds it for you if you are too lazy to type encoding: utf8 three times :wink:

Matt

Which database are you talking about specifically? I believe Rails will use whatever the current default is. MySQL, for example, has an installation-wide configuration file that controls that.

Right, I was talking about the MysqlAdapter; the other database adapters don't let you create a database.

Here is the create_database method:

      def create_database(name) #:nodoc:
        execute "CREATE DATABASE `#{name}`"
      end

Here is my suggestion:

      # Create a new MySQL database allowing you to specify the charset and collation.
      # By default the database is created with utf8 charset and utf8_bin collation.
      # usage: ActiveRecord::Base.connection.create_database('charset_plugin_test', {:charset => 'latin1', :collation => 'latin1_bin'})
      def create_database(name, options = {})
        execute "CREATE DATABASE `#{name}` DEFAULT CHARACTER SET `#{options[:charset] ||= 'utf8'}` COLLATE `#{options[:collation] ||= 'utf8_bin'}`"
      end
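To see which statement the suggestion would send without needing a running MySQL server, here is a minimal sketch that only builds the SQL string. The helper name create_database_sql is hypothetical and not part of ActiveRecord; the defaults simply mirror the ones proposed above.

```ruby
# Hypothetical helper (not part of ActiveRecord): returns the statement that
# the suggested create_database would hand to execute.
def create_database_sql(name, options = {})
  charset   = options[:charset]   || 'utf8'
  collation = options[:collation] || 'utf8_bin'
  "CREATE DATABASE `#{name}` DEFAULT CHARACTER SET `#{charset}` COLLATE `#{collation}`"
end

# Defaults: utf8 charset with utf8_bin collation.
puts create_database_sql('charset_plugin_test')

# Explicit charset and collation, as in the usage comment above.
puts create_database_sql('charset_plugin_test', :charset => 'latin1', :collation => 'latin1_bin')
```

Unlike the one-liner above, this sketch reads the options with || rather than ||=, so the caller's hash is left untouched.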

Nothing fancy, but using utf8 by default means that people should update their database.yml file to add the encoding to each environment.

Does it make sense?

-Matt

To me it does. It’s inexpensive and exposes native database functionality to the framework user.

Just a quick note: it's probably best to set the server-wide encoding to UTF-8. I've seen some instances, mostly with MySQL 4.x, where even with SET NAMES utf8 the MySQL client wouldn't go into UTF-8 mode. Ending up with mixed charsets in one table is a pita.
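As a rough illustration of why mixed charsets hurt, here is a small sketch in plain Ruby (1.9 or later, no database involved) of what happens when UTF-8 bytes are read back as if they were latin1. It only assumes that the mismatch happens somewhere between client and server; it does not reproduce the MySQL 4.x behaviour itself.

```ruby
# "é" stored as UTF-8 is the two bytes 0xC3 0xA9.
utf8_text = "\u00e9"

# A client that believes the bytes are latin1 (ISO-8859-1) decodes each byte
# as its own character and then re-encodes the result.
misread = utf8_text.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts utf8_text.bytes.map { |b| format("%02X", b) }.join(" ")
puts misread.length
```

The single character turns into two, and every round trip through a misconfigured connection can repeat the damage.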

Manfred

Somebody from the core team? Do you want a patch, or should I keep that in my plugin?

m>a

Please do patch. Let's encourage Unicode end-to-end.

jeremy

Thanks Jeremy,

Here is the ticket: http://dev.rubyonrails.org/ticket/8448

If the patch is applied, one can start a Rails app just like this:

      rails my_project
      cd my_project
      rake db:create
      ruby script/server

That's it! Ready to go. (Compatible with PostgreSQL, MySQL, and SQLite3.)

One thing though: we might want to add the key/value pair encoding: utf8 to each default environment in database.yml.
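A minimal sketch of what that could look like, with the file contents inlined so it runs on its own. The adapter and database names are placeholder assumptions, and environments_missing_utf8 is a hypothetical helper, not something Rails provides.

```ruby
require 'yaml'

# Assumed database.yml contents; database names are placeholders.
DATABASE_YML = <<-EOS
development:
  adapter: mysql
  database: my_project_development
  encoding: utf8
test:
  adapter: mysql
  database: my_project_test
  encoding: utf8
production:
  adapter: mysql
  database: my_project_production
  encoding: utf8
EOS

config = YAML.load(DATABASE_YML)

# Hypothetical check: names of the environments that do not declare utf8.
def environments_missing_utf8(config)
  config.select { |_env, settings| settings['encoding'] != 'utf8' }.keys
end

puts environments_missing_utf8(config).inspect
```

With all three environments declaring the encoding, the check finds nothing to report.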

- Matt