Hosting large static assets - best practices when bound to Rails

We want to add a bunch of large static assets to our app. These assets belong to a page for our research paper. They will live under the path /research/denoiser. It’s a completely static page with an index.html.

Normally we’d host this on GitHub Pages, but it would be great to have our URL as a path in our regular Rails app. We’re hosting our Rails app on Heroku. We could add everything to our Git repo and put it into the public/ folder, but the page and its assets come to almost 400 MB, and Heroku doesn’t support Git LFS. If I were hosting this on a regular VM with nginx or Apache, I’d just have nginx serve the files directly and leave the static files outside of Rails.

However, with our Heroku setup, it has to go through Rails.

Is there a way to have Rails redirect all calls to some static asset storage, like S3? The main goal is that the URL in the browser still shows ourrailsdomain.com/research/denoiser, and doesn’t redirect to a different domain.

Rails will let you configure a static resource server, but it’s all or nothing (basically every asset URL that Rails generates points at S3 or similar instead of your public folder). That may work for you. It’s very minimally documented in the Ruby on Rails Guides, under Configuring Rails Applications, and you may find some more real-life examples by searching on ‘rails asset_host’ or similar. That seems to be the thing to do when you’re on a restricted hosting setup. It may also be just a good idea if your web server is busy doing more Rails-y things in coordination with your application server.

Now this doesn’t speak to how you get your assets into that CDN, so there’s going to be some tooling to set up in that area. You’ll also have to do some work to ensure that everything that you expect to change over time has a fingerprint so it can live with the nature of a CDN (basically no cache expiration date). You may want to just proactively put everything in the asset pipeline, so it will be fingerprinted from the start.
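For reference, the asset_host setting is a one-liner in the environment config. A minimal sketch, assuming a CDN hostname of assets.example.com (a placeholder, not something from this thread):

```ruby
# config/environments/production.rb
Rails.application.configure do
  # Hypothetical hostname; point this at whatever CDN or bucket fronts your assets.
  config.asset_host = "https://assets.example.com"
end
```

Note that this only changes URLs that Rails generates through its asset helpers (image_tag, stylesheet_link_tag and so on). A hand-written index.html with relative paths would not be affected by it.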

Now as far as getting the URL to stay the same, I don’t think you can do that entirely in this setup. You can get very close – these assets can be hosted by a CDN but appear to be in a subdomain of your surrounding site – but it’s going to mean that assets.example.com and rails.example.com are many miles apart, owned by different people if those assets are on a CDN.

If you wanted to tie up a thread of your application server for the full duration of every download (and the assets total almost 400 MB), you could create a proxy route in your Rails app. That way, the URL would start with rails.example.com. It also has a very good chance of timing out on Heroku, depending on the network performance of the person who clicked (so putting this firmly into the corner of “designed to DDoS your app”).

A middle path might be to have a redirect route for those files, which would appear (if you hovered on the link in your page) to be in rails.example.com/research/denoiser but when clicked would engage your app (up to the level of routes.rb) where it would transform into a 301 redirect to some.cdn.example.net/bucket/name/denoiser.whatever.
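To make the redirect idea concrete, here is a rough sketch written as a bare Rack-style endpoint in plain Ruby, so it can be tried outside of Rails. The CDN base URL is a placeholder taken from the example above, and the AssetRedirect, CDN_BASE and PREFIX names are made up for illustration.

```ruby
# A Rack-compatible endpoint: anything that responds to call(env) and
# returns [status, headers, body]. No gems are needed to try it.
CDN_BASE = "https://some.cdn.example.net/bucket/name" # placeholder, not a real bucket
PREFIX   = "/research/denoiser/"

AssetRedirect = lambda do |env|
  path     = env["PATH_INFO"].to_s
  relative = path.delete_prefix(PREFIX)

  # Refuse paths outside the prefix, the bare prefix itself, and any
  # attempt to climb out of the folder with "..".
  if !path.start_with?(PREFIX) || relative.empty? || relative.split("/").include?("..")
    return [404, { "content-type" => "text/plain" }, ["Not found"]]
  end

  # Permanent redirect to the same relative path on the CDN.
  [301, { "location" => "#{CDN_BASE}/#{relative}" }, []]
end
```

In a real app you probably would not need a custom endpoint at all: the redirect helper in the routes DSL produces a 301 by default and can do this from routes.rb directly. The sketch just spells out what happens on each request. The app server only answers with a few bytes of headers, so no thread is held for the download itself.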

Give those ideas a play, and see what makes the most sense to you.

Walter

How about using ActiveStorage?

Thanks a lot for the replies. I ended up writing a controller that fetches (and caches) the index.html from S3 and serves it directly to the user, but redirects to S3 for every other request in the folder. That way, the browser URL remains the same for the user (there are no sub pages, just images and audio examples served via a single index.html).

I had to take extra care to make sure that the controller redirects to a URL with a trailing slash when serving the index.html - otherwise the relative paths in the index.html fail. For some reason, with_options trailing_slash: true do ... end in my routes.rb didn’t work.
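For anyone finding this later, the shape of it is roughly as follows. This is a simplified sketch written as a plain Rack-style class rather than the actual controller; the bucket URL, the class name, the five-minute cache lifetime and the injected fetcher are all assumptions for illustration.

```ruby
# Serves index.html itself (cached in memory) and redirects every other
# request under the folder to the bucket. Not thread-safe as written.
class DenoiserPage
  PREFIX     = "/research/denoiser"
  BUCKET_URL = "https://example-bucket.s3.amazonaws.com/denoiser" # placeholder

  # fetcher: any callable that takes a URL and returns the body as a String,
  # e.g. ->(url) { Net::HTTP.get(URI(url)) } after require "net/http".
  def initialize(fetcher:, ttl: 300, clock: -> { Time.now })
    @fetcher   = fetcher
    @ttl       = ttl
    @clock     = clock
    @index     = nil
    @cached_at = nil
  end

  def call(env)
    path = env["PATH_INFO"].to_s
    case path
    when PREFIX
      # Without the trailing slash, relative URLs in index.html would
      # resolve against /research/ instead of /research/denoiser/.
      [301, { "location" => "#{PREFIX}/" }, []]
    when "#{PREFIX}/", "#{PREFIX}/index.html"
      [200, { "content-type" => "text/html; charset=utf-8" }, [index_html]]
    else
      relative = path.delete_prefix("#{PREFIX}/")
      return not_found if relative == path || relative.split("/").include?("..")
      [302, { "location" => "#{BUCKET_URL}/#{relative}" }, []]
    end
  end

  private

  # Fetch index.html once, then reuse it until the cache lifetime expires.
  def index_html
    now = @clock.call
    if @index.nil? || now - @cached_at > @ttl
      @index     = @fetcher.call("#{BUCKET_URL}/index.html")
      @cached_at = now
    end
    @index
  end

  def not_found
    [404, { "content-type" => "text/plain" }, ["Not found"]]
  end
end
```

Only the small index.html ever passes through the app; the images and audio are fetched by the browser straight from the bucket after the redirect.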

This setup won’t win the prize for elegance, but it works, and I think it’s quite DDoS-safe.