Downloading a file with a non-UTF filename.

Hi,

I'm using Rails 1.2.2 and my app can upload files with special characters like ãçáéíóú.

I'm converting the filename (the physical name, not its content!) from UTF to ISO and it's working fine. Otherwise all filenames would have UTF chars (that would work, but it would confuse the client).

The problem is I can't download the file, because when I request a name such as fileção.jpg, Rails converts it to UTF (fileçà £o2.jpg) and throws a Routing Error or a 404 Not Found.

Any idea how I could convert this URL to ISO so it can find the filenames?

Thanks a lot.

Peter.

Hi,

I'm using Rails 1.2.2 and my app can upload files with special characters like ãçáéíóú.

I'm converting the filename (the physical name, not its content!) from UTF to ISO and it's working fine. Otherwise all filenames would have UTF chars (that would work, but it would confuse the client).

I presume you mean ISO-latin-1?

The problem is I can't download the file, because when I request a name such as fileção.jpg, Rails converts it to UTF (fileçà £o2.jpg) and throws a Routing Error or a 404 Not Found.

Scratching my head here, but aren't URLs meant to be US-ASCII only? Non-ASCII characters and reserved characters must be escaped.

http://gbiv.com/protocols/uri/rfc/rfc1738.txt

Any idea how I could convert this URL to ISO so it can find the filenames?

What you need to do is convert it to US-ASCII. Or autogenerate a unique name.

Most places which support uploads of arbitrary files will discard the original name anyway and save the file under an ID instead, or mangle the name for various reasons (XSS attacks being one of them, but name collisions are the primary reason).
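
As a rough illustration of the "save it under an ID" idea, here is a minimal sketch. The helper name storage_name_for is made up for this example, and it assumes a Ruby recent enough to ship SecureRandom; the original name would be kept elsewhere (a database column, say) purely for display.

```ruby
require "securerandom"

# Hypothetical helper: builds an ASCII-only name to store an upload under.
# Only the extension of the original name is kept, and only if it is a
# short run of plain ASCII letters and digits (e.g. ".jpg").
def storage_name_for(original_name)
  ext = File.extname(original_name).downcase
  ext = "" unless ext =~ /\A\.[a-z0-9]{1,10}\z/
  "#{SecureRandom.hex(16)}#{ext}"
end

stored = storage_name_for("file\u00E7\u00E3o.jpg") # "fileção.jpg"
puts stored # 32 random hex digits followed by ".jpg"
```

Because the stored name never contains anything outside a-z, 0-9 and a dot, it can go into a URL without any escaping or encoding questions.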

Hi Richard,

Thanks a lot for your help.

Yes I meant ISO-latin-1 or ISO-8859-1.

Thanks for your ideas. I did think about giving the files unique names. I'm just frustrated that, because of such a simple thing, I can't download a file with accented characters in its name!

I tried converting to US-ASCII using Iconv.new('US-ASCII', 'UTF-8').iconv(self.filename). The filename shows correctly in the browser, but only because the page is in UTF8. When I click it, the download fails: the file was saved with its name converted to ISO, and in the logs I see the request is for the name with UTF8 characters, so it will never find it!
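
To make the mismatch concrete, here is a small sketch of the two byte sequences involved. It assumes Ruby 1.9 or later, where String#encode exists; on the Ruby 1.8 that Rails 1.2.2 runs on, Iconv.new('ISO-8859-1', 'UTF-8').iconv(name) would play the same role. The variable names and the hex helper are only for illustration.

```ruby
# Hypothetical helper: prints a string's bytes as hex for comparison.
def hex(str)
  str.bytes.map { |b| format("%02X", b) }.join(" ")
end

# The name as it arrives from a UTF-8 page ("fileção.jpg").
requested = "file\u00E7\u00E3o.jpg"

# The name as it was stored on disk after the upload-time conversion.
on_disk = requested.encode("ISO-8859-1")

puts hex(requested) # 66 69 6C 65 C3 A7 C3 A3 6F 2E 6A 70 67
puts hex(on_disk)   # 66 69 6C 65 E7 E3 6F 2E 6A 70 67

# The lookup can only succeed if the requested name goes through the
# same conversion before it is compared with what is on disk.
puts on_disk.encode("UTF-8") == requested # true
```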

Any other suggestions?

Thanks,

Peter.

This isn't a fault of Ruby or anything. Read the RFC. A URL cannot contain accented characters; in fact, it can only contain a very limited subset of the ASCII characters (alphanumerics and some punctuation symbols).

Even though your operating system will let you use accented latin-1 and UTF characters in filenames, those filenames cannot be part of a URL. Your webserver may be able to interpret escaped characters and find the filename, but accented characters cannot be present in the URL to begin with.
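
A minimal sketch of what "escaped" means here, assuming Ruby 1.9 or later. The percent_encode helper is written out by hand for illustration rather than taken from any library, and it leaves unescaped only the unreserved set from the later RFC 3986 (letters, digits, "-", ".", "_", "~").

```ruby
# Hypothetical helper: percent-encodes every byte outside the unreserved set.
def percent_encode(str)
  str.bytes.map do |b|
    ch = b.chr
    if b < 128 && ch =~ /[A-Za-z0-9\-._~]/
      ch
    else
      format("%%%02X", b)
    end
  end.join
end

utf8_name   = "file\u00E7\u00E3o.jpg"          # "fileção.jpg" as UTF-8
latin1_name = utf8_name.encode("ISO-8859-1")   # same name as ISO-8859-1

puts percent_encode(utf8_name)   # file%C3%A7%C3%A3o.jpg
puts percent_encode(latin1_name) # file%E7%E3o.jpg
```

Both results are plain ASCII and legal in a URL, but they name different byte sequences, so the server only finds the file if the escaped bytes match the encoding the name was saved in.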

Hi Richard, who said it can't have accented characters? Of course it can and even domain names now can have accented characters!!!

http://developer.mozilla.org/en/docs/Internationalized_Domain_Names_(IDN)_Support_in_Mozilla_Browsers

I have no problem doing this in any other language. It's just that RoR, great framework though it is, has lots of weirdness when it comes to Unicode.

Tks,