1. Introduction
In my company, we lucked out that Rails 5.2 (and Active Storage) was released just before we needed to implement user uploads. This means that we’ve been using it in production for over 5 years, across 3 different hosts (Heroku, AWS, and GCP) and 3 different storage providers (S3, GCS, and R2).
Our primary use case is image galleries, either of products we sell, or user-uploaded images for their wedding/birthday/baby shower websites. This means that we rely heavily on image transformations, which Active Storage makes trivial to generate.
While we love Active Storage (which I’ll call AST from now on), there are certain design decisions in AST that one should be aware of, as they have an impact on the overall performance of the app, and not just the pages that use AST.
2. Understanding Active Storage
2.1 The basic use case we will use as an example
To understand some of the things I’ll talk about below, we must first take a look at how Active Storage works under the hood. Let’s assume you have a Company model and want to use AST to store its logo:
class Company < ApplicationRecord
has_one_attached :logo
end
You can attach a file by doing this:
@company.logo.attach(io: File.open(Rails.root.join("public/favicon-192.png")), filename: 'logo.png')
And then display an optimized version of it, which AST calls a “variant”, by doing this:
<%= image_tag @company.logo.variant(resize_to_limit: [200, 200], saver: { strip: true, compression: 9 }) %>
Open the page and you will see your company logo resized to fit within 200×200 pixels (`resize_to_limit` preserves the aspect ratio) and compressed to reduce data usage. No need for extra columns in your companies table. Just ask for a variant and you have it.
2.2 Tables and columns created by Active Storage
When you install Active Storage, it creates three new tables in your database.
2.2.1 active_storage_blobs
This table contains information about the file you have just attached:
- `key`: the name your file will have in the storage service. It’s generated by the `generate_unique_secure_token` method in `ActiveStorage::Blob` and contains only digits and lowercase letters;
- `filename`: the name your file originally had when you attached it;
- `content_type`: extracted by the marcel gem, based on the binary data, the declared type and the filename;
- `metadata`: extracted by the `analyze` method in `ActiveStorage::Blob`. It can tell you various things about the file, like height, width, rotation, etc.;
- `service_name`: the storage service your file was uploaded to. This was not part of the original implementation, but was added later to support having a different service per attachment;
- `byte_size`: the size of the file in bytes;
- `checksum`: you will never have to worry about this one. AST uses it to ensure the file is not corrupted.
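To make the `key` and `checksum` formats concrete, here is a self-contained sketch in plain Ruby (no Rails required). In current Rails, `generate_unique_secure_token` boils down to a 28-character base36 token, and the checksum is a base64-encoded MD5 digest of the file’s bytes; `SecureRandom.base36` is an ActiveSupport extension, so an equivalent is inlined here:

```ruby
require "securerandom"
require "digest"
require "stringio"

# 28 characters drawn from digits and lowercase letters, like AST's blob keys.
BASE36_ALPHABET = ("0".."9").to_a + ("a".."z").to_a

def generate_key(length = 28)
  Array.new(length) { BASE36_ALPHABET[SecureRandom.random_number(36)] }.join
end

# The checksum column is a base64-encoded MD5 digest of the file's contents.
def checksum_for(io)
  Digest::MD5.base64digest(io.read)
end

puts generate_key                          # e.g. "0h2gw7..." (random)
puts checksum_for(StringIO.new("hello"))   # "XUFAKrxLKna5cZ2REBfFkg=="
```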
2.2.2 active_storage_attachments
This is the join table used to connect the blobs (files) to your models.
- `blob_id`: reference to `active_storage_blobs`;
- `record_type` and `record_id`: polymorphic reference to your model. In our example, `record_type = 'Company'`;
- `name`: the `has_one_attached` attribute. In our example, `name = 'logo'`.
2.2.3 active_storage_variant_records
This table tracks which variants have already been generated. It was not part of the original implementation, but was added later as an optimization to avoid having to check if the variant file existed in the storage service.
- `blob_id`: reference to `active_storage_blobs`. This is NOT the variant’s blob, but the original blob;
- `variation_digest`: generated from the options we gave the `.variant` method that created this record. In our example, this would be the digest of the hash `{ resize_to_limit: [200, 200], saver: { strip: true, compression: 9 } }`.
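The real `variation_digest` is a SHA1 digest of the variation’s encoded key, but the essential property is easy to demonstrate without Rails: the same transformation options always produce the same digest, so AST can find an existing variant record with a single indexed lookup. A simplified, Rails-free sketch (the exact serialization differs from AST’s):

```ruby
require "digest"
require "json"

# Simplified stand-in for ActiveStorage::Variation#digest: hash the
# transformation options deterministically. (AST digests its own encoded
# variation key rather than raw JSON, but the idea is the same.)
def variation_digest(transformations)
  Digest::SHA1.base64digest(JSON.generate(transformations.sort.to_h))
end

thumb = { resize_to_limit: [200, 200], saver: { strip: true, compression: 9 } }
puts variation_digest(thumb)
```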
2.3 How it all works under the hood
2.3.1 When you attach an image
- AST creates a new record in `active_storage_blobs` (B1);
- AST creates a new record in `active_storage_attachments` (A1) connecting the newly created blob (B1) to the company;
- AST uploads the file to the storage service, using the `key` generated by the blob (B1);
- AST enqueues an `ActiveStorage::AnalyzeJob` to process the blob (B1) and extract metadata from it.
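You can watch these records being created from a Rails console (output abbreviated and illustrative; keys and ids will differ):

```ruby
company = Company.first
company.logo.attach(
  io: File.open(Rails.root.join("public/favicon-192.png")),
  filename: "logo.png"
)

company.logo.attachment
# => #<ActiveStorage::Attachment id: 1, name: "logo", record_type: "Company", ...>
company.logo.blob
# => #<ActiveStorage::Blob id: 1, key: "0h2gw7...", filename: "logo.png", ...>
```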
2.3.2 When you display an image
When you display the original file (not a variant) in an `image_tag`:

- AST looks for an `active_storage_attachment` (A1) for your company, with `name = "logo"`;
- AST looks for the `active_storage_blob` (B1) the attachment references;
- AST generates a URL that points to one of its own controllers, not the storage: `/rails/active_storage/blobs/:signed_id/*filename`.
When you display a variant in an `image_tag`:

- AST looks for an `active_storage_attachment` (A1) for your company, with `name = "logo"`;
- AST looks for the `active_storage_blob` (B1) the attachment references;
- AST generates a URL that points to one of its own controllers, not the storage: `/rails/active_storage/representations/:signed_blob_id/:variation_key/*filename`.
When the browser requests the URL of the original image:

- The request is routed to `ActiveStorage::BlobsController#show` (pre Rails 6.1; there are now two possible controllers, see 2.3.3);
- AST looks up the `active_storage_blob` (B1) encoded in the signed id;
- AST redirects the request to the URL of the file in the storage service.
When the browser requests the URL of the variant image for the first time:

- The request is routed to `ActiveStorage::RepresentationsController#show` (pre Rails 6.1; there are now two possible controllers, see 2.3.3);
- AST looks up the `active_storage_blob` (B1) encoded in the signed id;
- AST decodes the `variation_key` into a hash of options (in our example, `{ resize_to_limit: [200, 200], saver: { strip: true, compression: 9 } }`);
- AST checks `active_storage_variant_records` and finds nothing, which means this variant has not been processed yet;
- AST downloads the original file from storage;
- AST creates a new record in `active_storage_variant_records` (V1, which references the original blob);
- AST creates a new record in `active_storage_blobs` (B2) for the variant it’s about to generate;
- AST creates a new record in `active_storage_attachments` (A2) which references the new blob (B2) and the variant record (V1);
- AST passes the decoded options and the downloaded file to the `image_processing` gem, which generates the new file;
- AST uploads the newly generated file to storage, using the `key` specified in the blob (B2);
- AST redirects the request to the URL of the generated file in the storage service.
When the browser requests the URL of the variant image after the first time:

- The request is routed to `ActiveStorage::RepresentationsController#show`;
- AST looks up the `active_storage_blob` (B1) encoded in the signed id;
- AST decodes the variation key into a hash of options (in our example, `{ resize_to_limit: [200, 200], saver: { strip: true, compression: 9 } }`);
- AST checks `active_storage_variant_records` and finds the existing record (V1);
- AST looks for the `active_storage_attachment` (A2) that references the variant record (V1);
- AST looks for the `active_storage_blob` (B2) the attachment references;
- AST redirects the request to the URL of the generated file in the storage service.
2.3.3 Serving files through redirect mode, proxy mode and public mode
Until Rails 6.1, Active Storage could only serve images using redirect mode. That is, it generated a URL to one of its own controllers (blobs for original files, representations for variants), and that controller redirected the request to a signed, expirable URL for the actual file in storage. While this worked fine for private files, it meant that if you were serving public images, your CDN could not cache them.
Thankfully, two new modes were added by contributors:
- Proxy mode: instead of redirecting the request to the storage, AST streams the requested file. This allows your CDN to cache your images, but keeps one of your Puma workers busy while the file is being streamed.
- Public mode: this changes the generated URL so that instead of pointing to an AST controller, it points directly to the file in storage. This means no extra load on your app when an image is requested, but extra load (in the form of queries) to generate the URLs of variants when the view is being rendered.
Here’s a table explaining the controller used and URL generated in each mode:

| mode | image | controller | url |
|---|---|---|---|
| redirect | original | `ActiveStorage::Blobs::RedirectController` | `/rails/active_storage/blobs/redirect/:signed_id/*filename` |
| redirect | variant | `ActiveStorage::Representations::RedirectController` | `/rails/active_storage/representations/redirect/:signed_blob_id/:variation_key/*filename` |
| proxy | original | `ActiveStorage::Blobs::ProxyController` | `/rails/active_storage/blobs/proxy/:signed_id/*filename` |
| proxy | variant | `ActiveStorage::Representations::ProxyController` | `/rails/active_storage/representations/proxy/:signed_blob_id/:variation_key/*filename` |
| public | original | - | varies depending on the storage service (S3, Google Cloud Storage, etc.) |
| public | variant | - | varies depending on the storage service (S3, Google Cloud Storage, etc.) |
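If proxy mode fits your setup, you can make it the app-wide default in an initializer (Rails 6.1+), or opt in per URL with the `rails_storage_proxy_path`/`rails_storage_proxy_url` helpers:

```ruby
# config/initializers/active_storage.rb
# Generate proxy URLs instead of redirect URLs everywhere.
Rails.application.config.active_storage.resolve_model_to_route = :rails_storage_proxy
```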
2.3.4 The direct upload flow
When allowing your users to upload photos, there are two ways to use the built-in file field helper in the default form builder:
<%= form.file_field :attachments %>
<%= form.file_field :attachments, direct_upload: true %>
The first one is easier to use: the file is uploaded to your app, and your app then uploads it to the storage service. The second one is more complex, because it requires you to configure CORS in your storage service, and if you are building an SPA you will probably need to write custom JavaScript to handle the upload. But it has the advantage of not requiring your app to handle the file upload (I will discuss why that is important in 2.4).
When using the direct upload method, the flow is as follows:
- On form submission, Turbo notices the `direct_upload` instruction and halts the submission;
- Turbo sends a POST request to `ActiveStorage::DirectUploadsController#create`;
- AST creates a new record in `active_storage_blobs` (B1) for the file;
- AST returns JSON containing information about the blob and a signed, secure URL that Turbo should use to upload the file;
- Turbo uploads the file to storage and inserts the `signed_id` it got from AST as the value of the file field;
- Turbo resumes the form submission;
- When the app’s controller receives the params and uses them to save/create the model, AST grabs the value from the file field and uses it to find the blob (B1);
- AST creates a new record in `active_storage_attachments` (A1) which references the blob (B1) and the model;
- AST enqueues an `ActiveStorage::AnalyzeJob` to process the blob (B1) and extract metadata from it.
2.3.5 The PNG fallback
Among the many config options in AST, there are two that are important if you are displaying images:
- `web_image_content_types`: the image formats you are willing to serve to your users as-is. By default it contains only `png`, `jpeg` and `gif`.
- `variable_content_types`: the image formats the version of libvips/ImageMagick installed on your servers can handle.
Together, these two lists control AST’s behaviour when you ask it to display the original version of an image. If the image’s format is listed in `web_image_content_types`, AST serves it as-is. Otherwise, it checks whether the format is listed in `variable_content_types`; if it is, AST converts the image to PNG, since it assumes your image library can handle the conversion. Finally, if the format appears in neither list, AST serves the image as a binary file.
In practice, there are two changes you might need to make to these lists:
- If you want to serve `webp` images, you need to add it to `web_image_content_types`;
- If you are using a version of ImageMagick/libvips from your distro’s repos, there’s a pretty good chance it does not support converting some of the formats listed in `variable_content_types`, so you should remove them; otherwise your exception-tracking service (Sentry, Honeybadger, etc.) is going to get flooded with errors.
2.3.6 The analyze job
Every time a file is attached to a model, AST will enqueue an instance of `ActiveStorage::AnalyzeJob`, which in turn will call the `analyze` method on the blob. This method extracts metadata from the file by passing the blob to one of the analyzers available by default in AST:
- `ActiveStorage::Analyzer::ImageAnalyzer`: extracts width and height;
- `ActiveStorage::Analyzer::VideoAnalyzer`: extracts width, height, duration, angle, aspect ratio and whether the file contains audio and video channels;
- `ActiveStorage::Analyzer::AudioAnalyzer`: extracts duration, bit rate and sample rate.
While these are usually good enough, you can easily write your own analyzer to extract more information from the file. For example, let’s say you want to know if a file is transparent to figure out if it’s safe to convert a PNG into JPG to reduce its size, and you are using libvips:
class MyAnalyzer < ActiveStorage::Analyzer::ImageAnalyzer::Vips
  def metadata
    read_image do |image|
      if rotated_image?(image)
        { width: image.height, height: image.width, opaque: opaque?(image) }.compact
      else
        { width: image.width, height: image.height, opaque: opaque?(image) }.compact
      end
    end
  end

  private

  def opaque?(image)
    return true unless image.has_alpha?

    # If the minimum value of the alpha band is 255, no pixel is transparent.
    image[image.bands - 1].min == 255
  rescue ::Vips::Error
    false
  end
end
Then just add it to the analyzer config array:
Rails.application.config.to_prepare do
Rails.application.config.active_storage.analyzers.prepend MyAnalyzer
end
2.4 What all of this means for your app
Now that we know how AST’s particular brand of magic works under the hood to attach files and create variants without needing new columns or tables, here are a few rules of thumb and things to watch out for, whether you are handling images or any other type of file.
2.4.1 You have to be very careful about N+1 queries when displaying multiple records with attachments
As we saw above, to display the logo of your company, AST has to make two extra queries: first to find the attachment (the join table), and then the blob, which represents the actual file. To avoid that, you can either use the scope AST provides or build the includes by hand:
@companies = Company.all.with_attached_logo
@companies = Company.all.includes(:logo_attachment, :logo_blob)
To make it clear, the pattern is `with_attached_ATTRIBUTE` and `includes(:ATTRIBUTE_attachment, :ATTRIBUTE_blob)`, where `ATTRIBUTE` is the name you gave to `has_one_attached`.
2.4.2 Always use the direct upload version of file upload to protect your app from slow clients
When a user uploads a file through your server instead of directly to storage, they keep one of your Puma workers busy, preventing it from serving other requests. If the user has a slow connection, or is uploading a large file, this can take a long time.
If you are running only a few web servers with low concurrency (say, you are on Heroku, running 2 standard-2x web dynos), you might only have 4 of those workers. If two users are “slow clients”, you have just halved your capacity for as long as they are uploading their files, which means your other users might take longer to navigate between pages, causing frustration and a bad experience.
2.4.3 If you are using proxy mode, make sure you have nginx or Cloudflare between your servers and your users to protect your app from slow clients
Same reasoning as above. Proxy mode in AST keeps a Puma worker busy while the file is streamed from storage to the user. If the user is on a slow connection, that might keep your worker busy for a long time.
By having nginx or Cloudflare in front of your app, you can configure them to buffer the stream, allowing your worker to quickly hand them the entire file and get back to serving other requests while they deal with the slowness of the client, something they are much better equipped to do than Puma.
2.4.4 The on-demand variant generation is a great feature, but it can bring your entire app to its knees if you are not careful
Once again, let’s assume your app is small, or you prefer to scale horizontally, so your servers only have 1-2 vCPUs and not much memory (say, 1GB). This means Puma is probably configured with 2-3 workers and your app is using 80-90% of the memory.
In your app there’s a page where you allow users to upload multiple photos, and after the upload is done, they are redirected to their gallery, where you display smaller, compressed versions of their photos.
One of your users just uploaded 10 photos at once. After the redirect, their browser will request the smaller, compressed version of those 10 photos. Since these variants have never been generated, your servers will download and process all of them at the same time. This means multiple variant generations competing for limited CPU time and eating whatever memory was left, causing the servers to start swapping.
This will cause a major slowdown not just for the image-generating requests, but for every other request, as they now struggle to get CPU time and have to use swap to allocate objects. I remember our APM showing pages with sub-100ms response times going past 200-300ms because even a simple `.find` was taking 50ms while ImageMagick was processing a PNG.
Also, if you have fewer than 10 workers in total, some of those image requests will be waiting in the load balancer’s queue for a chance to be processed. And right behind them will be all the navigation requests of your other users.
There are a few ways to mitigate this:
- Replace ImageMagick with libvips. It’s faster and uses less memory;
- Scale your servers vertically instead of horizontally. It reduces the chance of all workers on a single server being occupied, and the extra vCPUs will process the images faster;
- If you are close to the memory ceiling, give your servers more memory to avoid swapping. On AWS this means switching families (c6i to m6i); on GCP, using a custom machine type to add 1GB of RAM;
- Generate your variants in advance, through a background job.
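The last mitigation can be sketched as a job enqueued right after the upload is saved, so the gallery page never triggers on-demand processing. The names here are hypothetical (a `Photo` model with `has_one_attached :file`); `variant(...).processed` is the AST call that generates and uploads the variant only if a matching variant record does not already exist:

```ruby
# app/jobs/pregenerate_variants_job.rb -- hypothetical sketch
class PregenerateVariantsJob < ApplicationJob
  GALLERY_THUMB = { resize_to_limit: [200, 200], saver: { strip: true, compression: 9 } }

  def perform(photo)
    # `processed` generates the variant and uploads it to storage only if a
    # matching active_storage_variant_records row does not already exist.
    photo.file.variant(GALLERY_THUMB).processed
  end
end

# After the upload is saved, enqueue one job per photo:
#   @photos.each { |photo| PregenerateVariantsJob.perform_later(photo) }
```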
3. Running in production to serve images
Now that we understand how AST works, and the impact its design decisions have on your app, let’s talk about some lessons we learned from running it in production.
Warning: This part is a bit of a rant, and parts of it are about handling images in general, not AST specifically.
ImageMagick is too slow, uses too much memory and has too many security issues…
That’s why Rails 7 switched to libvips by default. If you want a more in-depth explanation, check my “Make Vips the recommended/default variant processor for Active Storage” thread here in the forum, and compare the list of known CVEs for libvips with the known CVEs for ImageMagick.
… and the version of the image libraries in your distro’s repos is too old and doesn’t support enough file formats
This was one of the main drivers for us to leave Heroku. We were stuck on 18.04 LTS and had no way to get a more recent version of libvips installed. Fortunately, this is less of a problem now in 2023, when every LTS distro has a recent enough version of libvips in its repos.
What’s still a problem is file format support. The packages available in all distros support JPEG, PNG, BMP, TIFF, GIF and, if you install the right package, WEBP. And that’s it. But you know users… gotta catch 'em all. Users can and will upload HEIC, AVIF, JPEG 2000, JPEG XL, and whatever else they can find.
So, how do you support those formats? You install the dev packages of every file format you need, then you compile your image library (ImageMagick or libvips) from source so that it can properly link against said packages.
Yes, compile from source.
And since you’re already doing that, I recommend you also remove libpng and libjpeg-turbo from your system and replace them with the faster libspng and the much better mozjpeg (this last one is important; I’ll talk about it later).
Your users will laugh at your poor attempts to restrict image formats using `accept='image/jpeg,image/png'` and upload whatever they want…
I haven’t figured out how they do it, but it seems that one of the Android browsers completely ignores the `accept` attribute (or at least the accepted image formats) and lets the user upload whatever they want. It’s one of the reasons we’ve had to add support for so many file formats in our image libraries.
… and as a bonus, just because it’s a video file it does not mean it contains a sound stream… or a video stream
Yup. Android’s voice recorder uses a file format that is identified as video. So if you’re doing video manipulation in your web app, be aware that you might receive files you think are videos that don’t actually contain a video channel, so make sure you check your blob’s `metadata`.
Your CDN will not keep your images in their cache despite what they say…
You checked Cloudflare’s documentation and ran a few tests and noticed that the first two times you request an image you get a cache MISS, and the request hits your app. But on the third time, you get a HIT and the request does not hit your app. So you think you’re good, right?
Wrong.
All this means is that the data center (PoP) serving your requests has it in its cache. But CDNs don’t have a single data center. They have dozens, all around the world. Just in the US, Cloudflare has 46 of them. And unless you’re paying for a premium plan (or addon like Argo) where the cache is automatically propagated between PoPs, each one of them is going to make two requests for that image.
So just in the US, that’s 92 requests that will hit your Puma servers before the image is fully cached. Then all is well, right?
Wrong again.
You see, just because the image is in the cache, it doesn’t mean it will stay there. No, it does not matter that you set its TTL for 1 month since retention and freshness are different things. If your image is not requested often enough in a specific PoP, it will get evicted. Which means that PoP will let two more requests hit your servers.
Sure, your JS and CSS files, as well as your logo and maybe the images on your home page, have enough requests to keep them in the cache. But if you’re running an image-heavy web app? You’re going to have a bad time. If you’re running with only a few servers, you will definitely feel it in your request queue times (you are tracking those, right?) when a user decides they like half a dozen products on your list, opens each one in a new background tab, and suddenly a few dozen requests are hitting your proxy controllers (I really hope those variants are already processed).
… and nginx can solve the problem, but it has a pretty massive footgun if you are not careful
You can configure nginx to store any file served by your app (including those streamed by the proxy controllers) on its disk, and serve it directly from cache when it is requested again. This keeps your Puma workers free to handle other requests (and protected from slow clients).
However, when you configure it to do so, make absolutely sure that you use `proxy_hide_header` and `proxy_ignore_headers` on the `Set-Cookie` header. If you don’t, nginx will cache the session cookie, which your app might be using to store sensitive information, such as which user is logged in.
What does that mean? It means that user A is logged in and navigating normally, and suddenly they are served a cached image with user B’s Set-Cookie. And now user A is logged in as user B. I don’t have to tell you how bad that is, right? Especially if user B is an admin.
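For reference, the fix looks something like this (a minimal sketch; it assumes a `proxy_cache_path ... keys_zone=assets_cache:10m` is defined elsewhere and that `app_upstream` points at your Puma servers):

```nginx
location /rails/active_storage/ {
    proxy_pass http://app_upstream;
    proxy_cache assets_cache;
    proxy_cache_valid 200 30d;

    # The footgun fix: never cache or forward session cookies with cached files.
    proxy_ignore_headers Set-Cookie;
    proxy_hide_header Set-Cookie;
}
```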
Serving properly optimized images is hard…
So here’s my recommendation: resize them to 2x (that is, if you are going to display them at 100×100, resize them to 200×200), since anything higher is unnecessary. Then apply one of the transformations below.
If you are using Vips:
# JPEG
user.avatar.variant({ saver: { strip: true, quality: 80, interlace: true, optimize_coding: true, trellis_quant: true, quant_table: 3 }, format: "jpg" })
# PNG
user.avatar.variant({ saver: { strip: true, compression: 9 }, format: "png" })
# WEBP
user.avatar.variant({ saver: { strip: true, quality: 75, lossless: false, alpha_q: 85, reduction_effort: 6, smart_subsample: true }, format: "webp" })
If you are using ImageMagick:
# JPEG
user.avatar.variant({ saver: { strip: true, quality: 80, interlace: "JPEG", sampling_factor: "4:2:0", colorspace: "sRGB", background: :white, flatten: true, alpha: :off }, format: "jpg" })
# PNG
user.avatar.variant({ saver: { strip: true, quality: 75 }, format: "png" })
# WEBP
user.avatar.variant({ saver: { strip: true, quality: 75, define: { webp: { lossless: false, alpha_quality: 85, thread_level: 1 } } }, format: "webp" })
Two things about those options above:
- Make sure you add the `format` keyword. Recently some Android phones have started sending images with a `.jfif` file extension, and those break libvips even though they are normal JPEGs. By adding `format: :jpg` you are telling it that it’s fine to treat them as JPEGs.
- The options `optimize_coding`, `trellis_quant` and `quant_table` will only work if your image libraries have been compiled against mozjpeg; with the default libjpeg they are ignored. In my experience they bring a further 20% to 35% size reduction without any noticeable loss in quality, which is enough to put JPEG on par with WEBP for many images.
… and Lighthouse/PageSpeed are going to try to bully you into a few bad choices
When it comes to images on your pages, there are two things that Lighthouse and PageSpeed will try to convince you to do: use WEBP, and resize images to the exact size at which they will be displayed.
That is not always good advice. WEBP does not support interlacing, so the image is not displayed until it has been fully downloaded. JPEG, on the other hand, does, so even if its file size is larger (and it might not be, if you’re using mozjpeg), users will perceive it as loading faster, because a low-resolution version shows up before the full WEBP version would.
And when it comes to resizing images, it’s better not to go overboard. Let’s say your app has a product page where you display a large version of the photo along with a list of thumbnails that the user can click to view other photos. It’s better to let the thumbnails use the same variant as the large photo since that means that when the user clicks on them, the file will already be in their browser cache and be displayed immediately.
Every storage service has its own quirks that will drive you insane…
The S3 gem is configured to auto-retry failures and has its timeouts set to 60 seconds by default, which means it can spend minutes trying to complete a single upload, keeping a worker busy.
Google Cloud Storage does not like it when two requests try to update metadata at the same time, which will break your code if you run `.analyze` at the same time as the `ActiveStorage::AnalyzeJob` for a blob.
R2 is still in its infancy, and in the couple of months since we’ve started using it, we’ve already had a few temporary connection problems.
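For the S3 quirk, the client options can be tightened directly in `config/storage.yml` — everything besides the AST-specific keys is passed through to the AWS SDK client (values below are illustrative):

```yaml
# config/storage.yml
amazon:
  service: S3
  bucket: my-bucket        # illustrative name
  region: us-east-1
  # Passed through to the AWS SDK client: fail fast instead of
  # keeping a Puma worker busy for minutes.
  retry_limit: 1
  http_open_timeout: 5
  http_read_timeout: 10
```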
… and if you want to migrate from one to another you and the company’s accountant are going to hate the person who made that decision
Unless you are using R2, you are going to be charged per GB transferred. And since AST does NOT allow you to split your files into folders, you have no way of differentiating between original files (which you might have no way of acquiring again) and variant files (which you can regenerate). Therefore, you will have to pay to transfer all of them.
Bonus 1: If someone were to write ‘Active Storage, the good parts’, then `has_many_attached` would NOT be in it
Every time we used `has_many_attached`, we’ve regretted it. Not because it’s a bad feature, but because we always end up needing to add some extra information to those files, which you can’t do when they are blobs.
For example, your product has many images, so you do this:
class Product < ApplicationRecord
has_many_attached :images
end
Seems obvious. You can even do `product.images.first` to display a cover image. Except, a month from now, you are looking at the page that lists every product you have available, and you notice that for one of the clothing articles the image is a size table. You want to choose another image as the cover, but you can’t, because the images are blobs, and you can’t (shouldn’t) add extra attributes to them. So you have to redo your models:
class Product < ApplicationRecord
has_many :photos
end
class Photo < ApplicationRecord
has_one_attached :file
validates :position, presence: true
end
And after you do that, you realize you now have to migrate possibly millions of images from simple blobs into full models. So do yourself a favor and stick to `has_one_attached`.
Bonus 2: You might need a larger connection pool than you think
Recently, in the CGRP Slack group, there was a discussion about an app throwing `ActiveRecord::ConnectionTimeoutError` even though its pool was configured to be equal to Puma’s thread count. @tekin.co.uk dug in until he found out it was being caused by Active Storage’s proxy mode. Check out the post on his blog for more details.