Long Japanese filenames don't download correctly in Safari

Hello all, I’m running into some trouble with Active Storage and S3. It seems to happen when teachers upload files with long Japanese names.

Safari ends up naming the downloaded file after the S3 service key instead of the original filename.

It could be a bug with Safari, but I’m wondering if anyone can lend some technical insight?

My guess is that percent-escaping all the kanji characters makes the filename very long in the redirect download URL. If I go into the database and shorten the attachment's filename, the download starts working correctly.
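A rough way to check this guess is to measure how much the name grows under percent-encoding. The sketch below uses Ruby's standard-library `ERB::Util.url_encode` as a stand-in. That is an assumption made for illustration, not a claim about the exact routine Active Storage uses.

```ruby
require "erb"

# Filename as shown in this post.
name = "21秋5レベル漢字の宿題_第6週1日目〜3日目.pages"

# Encoded once: roughly how the name looks inside a header value.
once = ERB::Util.url_encode(name)
# Encoded twice: roughly how it looks once that header value is itself
# placed inside a URL query string (every "%" becomes "%25").
twice = ERB::Util.url_encode(once)

# A kanji is 3 UTF-8 bytes, so it becomes 9 characters when encoded once
# ("%E7%A7%8B") and 15 characters when encoded twice ("%25E7%25A7%258B").
puts "characters:    #{name.length}"
puts "UTF-8 bytes:   #{name.bytesize}"
puts "encoded once:  #{once.length}"
puts "encoded twice: #{twice.length}"
```

So a name that looks short on screen can take up several times as many characters in the redirect URL.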

I’d like to pinpoint the issue to submit bug reports or fixes to any parties necessary.

Thanks!

curl -v https://<REMOVED>/rails/active_storage/blobs/redirect/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaEpJaWxrT0dZd016WXpOaTFqWm1NeUxUUmlNV1F0WWpnMU1TMDRPRFV6TXpnMU4yTTNaR1lHT2daRlZBPT0iLCJleHAiOm51bGwsInB1ciI6ImJsb2JfaWQifX0=--887918f1bf59181149800e483e7f9a7bde2c71bc/21秋5レベル漢字の宿題_第6週1日目〜3日目.pages

< HTTP/1.1 302 Found
< Server: Cowboy
< Connection: keep-alive
< X-Frame-Options: SAMEORIGIN
< X-Xss-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< X-Download-Options: noopen
< X-Permitted-Cross-Domain-Policies: none
< Referrer-Policy: strict-origin-when-cross-origin
< Date: Wed, 27 Oct 2021 07:29:25 GMT
< Location: https://s3.ap-northeast-1.amazonaws.com/<REMOVED>/ah1yvfb5rwqb8ggfuqsbwjvd7ese?response-content-disposition=attachment%3B%20filename%3D%2221%253F%253F%253F%253F%253F%253F%253F%253F%253F%253F_%253F%253F%253F1%253F%253F%253F%253F%253F%253F.pages%22%3B%20filename%2A%3DUTF-8%27%2721%25E7%25A7%258B%25EF%25BC%2595%25E3%2583%25AC%25E3%2583%2599%25E3%2583%25AB%25E6%25BC%25A2%25E5%25AD%2597%25E3%2581%25AE%25E5%25AE%25BF%25E9%25A1%258C_%25E7%25AC%25AC%25EF%25BC%2596%25E9%2580%25B11%25E6%2597%25A5%25E7%259B%25AE%25E3%2580%259C%25EF%25BC%2593%25E6%2597%25A5%25E7%259B%25AE.pages&response-content-type=application%2Fvnd.apple.pages&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5NXJP2P5DVZT4ZPT%2F20211027%2Fap-northeast-1%2Fs3%2Faws4_request&X-Amz-Date=20211027T072925Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=d25cea933248f3715b1d3098bf02e4c34694b12af708f29d12551e9b784f561c
< Content-Type: text/html; charset=utf-8
< Cache-Control: max-age=300, private
< X-Request-Id: dbc03c9a-89ec-45e7-a62f-84b8c33aeb1d
< X-Runtime: 0.005716
< Strict-Transport-Security: max-age=63072000; includeSubDomains
< Transfer-Encoding: chunked
< Via: 1.1 vegur

The newest versions of Safari on iOS and macOS are OK, but I'm wondering if there is something I can do to fix the problem other than instructing teachers to shorten their file names.
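If shortening turns out to be the only workaround, it could be done in code instead of by the teachers. This is a minimal sketch in plain Ruby. The helper name `shorten_filename` and the 150-character budget are hypothetical, since the actual limit (if there is one) in the affected Safari versions is not known. Where to call it in the upload flow depends on the app.

```ruby
require "erb"

# Hypothetical budget for the percent-encoded name. The real limit in the
# affected Safari versions is an unknown; adjust after testing.
MAX_ENCODED_LENGTH = 150

# Drops trailing characters from the base name until the percent-encoded
# result fits the budget. The extension is always kept.
def shorten_filename(name, max_encoded: MAX_ENCODED_LENGTH)
  ext = File.extname(name)
  # Work on grapheme clusters so combined characters are not split apart.
  clusters = File.basename(name, ext).grapheme_clusters
  while clusters.any? && ERB::Util.url_encode(clusters.join + ext).length > max_encoded
    clusters.pop
  end
  clusters.join + ext
end

puts shorten_filename("秋" * 100 + ".pages", max_encoded: 96)
```

Names that already fit the budget are returned unchanged.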

Hi Lee, thank you for reporting this problem.

We need your help to track this down: is it a problem in Rails, in Safari, or in some machinery in between?

As far as we know, Rails generates valid URLs. Please give us an example of a failing URL.

The web server is also involved. We have to check whether the generated URL was wrong or whether the web server does not understand it.

Unicode is tricky, and even more so in URLs.

Do not expect a Japanese string to round-trip through a URL unchanged. This does not even work reliably for European languages.

If you want to address a resource, give it an integer/bigint key.

Unicode strings are not good for addressing a resource. This will always fail, however hard you try.

But you want human-readable ("speaking") URLs? Put the tag in the routes:

https://mysite/resource-path/$id/unicode-tag

Rely only on the $id to resolve the request, and simply ignore the unicode-tag.
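The id-plus-tag idea can be sketched in plain Ruby (rather than Rails routing, so that it runs on its own). The `RESOURCES` hash, the `ROUTE` pattern, and the `resolve` helper are hypothetical names used only for illustration.

```ruby
# Hypothetical in-memory store standing in for a database table.
RESOURCES = { 42 => "homework record" }.freeze

# Matches /resource-path/<id> with an optional trailing tag that is ignored.
ROUTE = %r{\A/resource-path/(?<id>\d+)(?:/[^/]*)?\z}

# Looks the resource up by its integer id only.
def resolve(path)
  match = ROUTE.match(path)
  match && RESOURCES[match[:id].to_i]
end

# The tag does not take part in the lookup, so both paths hit the same record.
puts resolve("/resource-path/42/漢字の宿題")
puts resolve("/resource-path/42/anything-else")
```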

As an added bonus, you might want to make sure that the unicode-tag is always present and always the same in all the links that you show (even though it is not relevant for resolving the link).

This helps a lot with Google et al., since it tells them that this is a unique resource.

Maybe speaking URLs are a thing of the past, and we should really address resources by id only. But they help when reading the log files.

Thank you for your reply.

The URLs are in my curl example above.

  1. First, a call to the Active Storage blobs redirect controller, whose URL (generated by Rails) ends with the filename 21秋5レベル漢字の宿題_第6週1日目〜3日目.pages
  2. The response redirects us to Amazon with the filename encoded, seemingly in two formats: filename%3D%2221%253F%253F%253F%253F%253F%253F%253F%253F%253F%253F_%253F%253F%253F1%253F%253F%253F%253F%253F%253F.pages%22%3B%20filename%2A%3DUTF-8%27%2721%25E7%25A7%258B%25EF%25BC%2595%25E3%2583%25AC%25E3%2583%2599%25E3%2583%25AB%25E6%25BC%25A2%25E5%25AD%2597%25E3%2581%25AE%25E5%25AE%25BF%25E9%25A1%258C_%25E7%25AC%25AC%25EF%25BC%2596%25E9%2580%25B11%25E6%2597%25A5%25E7%259B%25AE%25E3%2580%259C%25EF%25BC%2593%25E6%2597%25A5%25E7%259B%25AE.pages
  3. In some browsers, and in the newest version of Safari, this works and the file is downloaded with the correct filename. In some versions of Safari, however, something fails and the filename used is just the service key, without any extension.
  4. Going into the database and chopping off part of the filename fixes the problem, which most likely means it’s about the length of the escaped filename in the URL. Why is the filename included twice?
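On the "twice" question: decoding the `response-content-disposition` value from the redirect shows two parameters that look like the pair described in RFC 6266, a plain `filename=` fallback and an extended `filename*=UTF-8''...` form for clients that understand it. The sketch below only decodes the value captured in the curl output with Ruby's standard library. It makes no claim about which parameter a given Safari version picks.

```ruby
require "uri"

# Value of response-content-disposition, copied from the Location header above.
raw = "attachment%3B%20" \
  "filename%3D%2221%253F%253F%253F%253F%253F%253F%253F%253F%253F%253F_%253F%253F%253F1%253F%253F%253F%253F%253F%253F.pages%22%3B%20" \
  "filename%2A%3DUTF-8%27%2721%25E7%25A7%258B%25EF%25BC%2595%25E3%2583%25AC%25E3%2583%2599%25E3%2583%25AB%25E6%25BC%25A2%25E5%25AD%2597%25E3%2581%25AE%25E5%25AE%25BF%25E9%25A1%258C_%25E7%25AC%25AC%25EF%25BC%2596%25E9%2580%25B11%25E6%2597%25A5%25E7%259B%25AE%25E3%2580%259C%25EF%25BC%2593%25E6%2597%25A5%25E7%259B%25AE.pages"

# First decode: undo the query-string encoding to recover the
# Content-Disposition header value that S3 is asked to send back.
header = URI.decode_www_form_component(raw)
puts header

# Pull out the two filename parameters from that header value.
fallback_param = header[/filename="([^"]*)"/, 1]
extended_param = header[/filename\*=UTF-8''([^;]*)/, 1]

# Second decode: undo the percent-encoding inside each parameter.
fallback_name = URI.decode_www_form_component(fallback_param)
utf8_name = URI.decode_www_form_component(extended_param)

puts fallback_name # ASCII-only fallback
puts utf8_name     # full UTF-8 name
```

In the fallback, every non-ASCII character has been replaced by `?`, so it has the same number of characters as the UTF-8 name. Because the whole header is encoded a second time for the query string, both copies add to the length of the redirect URL.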

Hi Lee, we already discussed this a few days ago.

My suggestion was to address resources with an integer key.

Unicode strings are not good resource addresses. Even the same string can end up different because of the encoding rules.

#9 (a song by the beatles) https://www.youtube.com/watch?v=SNdcFPjGsm8&t=91s

~eike

Sorry, either I’m not understanding you or you’re not understanding me.

I’m using Active Storage. I’m not generating any URLs myself. The problem is not the URL that points to Rails.

The problem is (as far as I can tell) either

  1. what Rails creates to point to Amazon or
  2. what Amazon returns

And not supplying the filename would defeat the purpose of the download.