I'm having an encoding problem, even though I'm trying to keep all my data in UTF-8 at every step. Here's the path the data takes:
I have a CSV file that I believe is UTF-8, based on inspecting it in a hex editor. One of the words is:
64 65 62 74 6F 72 E2 80 99 73
which translates to:
debtor’s [that is, it contains a smart apostrophe]
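As a sanity check on that reading of the hex dump, here is a minimal sketch that rebuilds the word from the bytes above and asks Ruby whether they form valid UTF-8. It assumes Ruby 1.9 or later (String#force_encoding does not exist in 1.8.7); the variable names are just for illustration.

```ruby
# Bytes copied from the hex editor dump of the CSV file.
hex = "64 65 62 74 6F 72 E2 80 99 73"

# Turn each hex pair into an integer, pack them into a raw byte string,
# then label (not transcode) the bytes as UTF-8.
bytes = hex.split.map { |h| h.to_i(16) }
word  = bytes.pack("C*").force_encoding("UTF-8")

puts word.valid_encoding?  # true only if the bytes are well-formed UTF-8
puts word.bytesize         # number of bytes
puts word.length           # number of characters
```

If the three bytes E2 80 99 are a single character (U+2019, the right single quotation mark), the character count should be two less than the byte count.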
Now, I'm reading that string in using:
require 'csv'
CSV.foreach(fname, :encoding => 'u') do |row|
Then I'm just storing that string using the standard ActiveRecord method.
In mysql, when I run "show table status;", I get "utf8_unicode_ci" as the collation for all the tables.
When I try to display that data in a browser, however, I get a 500 error with the message:
"incompatible character encodings: ASCII-8BIT and UTF-8"
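For reference, this is a small sketch of how that kind of error can be reproduced in isolation. It assumes Ruby 1.9 or later, and it assumes (I have not confirmed this) that the string coming back from the database is tagged ASCII-8BIT while the template text is UTF-8. The strings and variable names are made up for the example.

```ruby
# A string holding the right bytes but tagged as binary (ASCII-8BIT),
# which is my guess at what comes back from the database.
binary = "debtor\xE2\x80\x99s".dup.force_encoding("ASCII-8BIT")

# A UTF-8 string with a non-ASCII character, standing in for template text.
utf8 = "caf\u00E9 "

message = nil
begin
  binary + utf8   # both sides contain non-ASCII bytes, so Ruby refuses
rescue Encoding::CompatibilityError => e
  message = e.message
end
puts message

# Relabeling the bytes as UTF-8 (no transcoding) makes the two compatible.
fixed  = binary.dup.force_encoding("UTF-8")
result = utf8 + fixed
puts result.encoding
```

Ruby 1.8.7 strings carry no encoding tag at all, which may be why the same page does not fail on a 1.8.7 machine.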
In the header of my page, I see:
Content-Type text/html; charset=UTF-8
I don't know where else to look.
How can I tell exactly what is stored in the database? If I query in mysql, I get a single question mark in place of the character, but I don't know whether that is caused by a translation after the fact. In other words, I ssh'd to the CentOS machine, started mysql, and ran a select. I get:
debtor?s
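One way to take the terminal out of the picture might be to select the value through MySQL's HEX() function and look at the raw bytes. The sketch below is a hypothetical helper (diagnose_hex is my own name, not part of any library) that classifies such a hex string under three assumptions: intact UTF-8 shows E2 80 99, a value replaced on insert shows a literal 3F, and a value stored as UTF-8 after being misread as latin1/cp1252 shows C3 A2 E2 82 AC E2 84 A2.

```ruby
# Hypothetical helper: classify a hex string such as the output of
# MySQL's HEX() for the stored word. Only these three patterns are checked.
def diagnose_hex(hex)
  # Normalize to space-separated byte pairs so matches stay byte-aligned.
  spaced = hex.delete(" ").upcase.scan(/../).join(" ")

  if spaced.include?("C3 A2 E2 82 AC E2 84 A2")
    :double_encoded       # UTF-8 bytes were read as cp1252 and re-encoded
  elsif spaced.include?("E2 80 99")
    :utf8_intact          # the apostrophe is stored as proper UTF-8
  elsif spaced.include?("3F")
    :replaced_on_insert   # a literal "?" byte is what is actually stored
  else
    :unknown
  end
end

puts diagnose_hex("646562746F72E2809973")
puts diagnose_hex("646562746F723F73")
```

Note that 3F is simply the byte for "?", so this check only means something for a word that is not supposed to contain a real question mark.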
(Note: the page works correctly on my OS X development machine, using Ruby 1.8.7, so that is making debugging more challenging.)