If you use ActiveRecord::Base.connection.create_database you'll notice
that by default the created db will use latin1 encoding. I created a
plugin to handle different charsets and collations (on top of helping
you with other boring DB tasks).
You can check out the early version of the plugin:
  svn checkout svn://rubyforge.org/var/svn/raketasks/db_tasks
(I really need to move my projects out of rubyforge, it's a real pain
to browse the code there.)
Anyway, if people consider it an important feature, I'd be glad
to submit a patch. Oh, by the way, when you create your database with a
specific charset, tables inherit the charset and collation
automatically. On the Rails side, encoded queries are handled by adding
an encoding value to each environment in database.yml. I have a rake
task which does that for you if you are too lazy to type
encoding: utf8 three times.
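For what it's worth, here is a rough sketch of what such a task could do under the hood. This is not the plugin's actual code: add_encoding is a hypothetical helper name, and the sample config is made up for illustration. It just loads the YAML, adds an encoding key to every environment that doesn't already have one, and dumps it back out.

```ruby
require 'yaml'

# Hypothetical helper (not taken from the plugin): returns the YAML text
# with an `encoding` key added to every environment that lacks one.
def add_encoding(yaml_text, encoding = 'utf8')
  config = YAML.load(yaml_text)
  config.each_value do |settings|
    next unless settings.is_a?(Hash)   # skip anything that isn't an environment block
    settings['encoding'] ||= encoding  # leave an explicit encoding untouched
  end
  YAML.dump(config)
end

# Made-up sample config, standing in for config/database.yml
sample = <<~YAML
  development:
    adapter: mysql
    database: app_development
  test:
    adapter: mysql
    database: app_test
    encoding: latin1
YAML

puts add_encoding(sample)
```

An environment that already declares an encoding (test, in the sample) is left alone, so running it twice is harmless.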
Which database are you talking about specifically? I believe Rails
will use whatever the current default is. MySQL, for example, has an
installation-wide configuration file that controls that.
Right, I was talking about the MysqlAdapter; the other database
adapters don't let you create a database.
Here is the create_database method:
  def create_database(name) #:nodoc:
    execute "CREATE DATABASE `#{name}`"
  end
Here is my suggestion:
  # Create a new MySQL database, allowing you to specify the charset and
  # collation. By default the database is created with the utf8 charset
  # and the utf8_bin collation.
  # Usage:
  #   ActiveRecord::Base.connection.create_database('charset_plugin_test',
  #     {:charset => 'latin1', :collation => 'latin1_bin'})
  def create_database(name, options = {})
    charset   = options[:charset]   || 'utf8'      # don't mutate the caller's hash
    collation = options[:collation] || 'utf8_bin'
    execute "CREATE DATABASE `#{name}` DEFAULT CHARACTER SET `#{charset}` COLLATE `#{collation}`"
  end
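If you want to see the statement this would send without having a MySQL server handy, something like the following sketch works. create_database_sql is a hypothetical name of mine, not part of ActiveRecord; it only builds the string that the suggested create_database would pass to execute.

```ruby
# Hypothetical helper: builds the SQL without touching a database,
# so the defaults can be inspected or unit-tested.
def create_database_sql(name, options = {})
  charset   = options[:charset]   || 'utf8'
  collation = options[:collation] || 'utf8_bin'
  "CREATE DATABASE `#{name}` DEFAULT CHARACTER SET `#{charset}` COLLATE `#{collation}`"
end

# Defaults (utf8 / utf8_bin)
puts create_database_sql('charset_plugin_test')

# Explicit charset and collation
puts create_database_sql('charset_plugin_test',
                         :charset => 'latin1', :collation => 'latin1_bin')
```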
Nothing fancy, but using utf8 by default means that people have to (or
at least should) update their database.yml file to add the encoding to
each environment.
Just a quick note: it's probably best to set the server-wide encoding to UTF-8. I've seen some instances, mostly with MySQL 4.x, where even with SET NAMES utf8 the MySQL client wouldn't go into UTF-8 mode. Ending up with mixed charsets in one table is a pita.