Last month, 37signals introduced Thruster, a “zero-config” gem that makes your web pages load faster by solving various problems that would otherwise require changes in multiple places in your infrastructure.
This post explains what those problems are, how Thruster solves them, and why you might want to use it even if you have a CDN like Cloudflare in front of your app, which already implements all the features that Thruster offers.
1. Puma is a great application server, but a poor web server
Both web and application servers work by receiving requests and responding with content. The difference between them lies in the type of content they specialize in:
- Web: They are designed to handle static content. That is, content that is saved to a file, like the assets of your app (CSS, JS, images, etc.). The most well-known are Nginx and Apache.
- Application: They are designed to handle dynamic content. That is, content that your app generates on the fly for that specific request, like HTML or JSON. For a Rails app, these include Puma, Unicorn, and Passenger.
So, when a user types the URL of your app (e.g., https://vinklo.com.br), Puma does a great job at generating and returning the HTML, but when the browser starts requesting the assets it needs to actually render the page… not so much.
1.1 What’s wrong with Puma?
The problem is that Puma does not support two critical features of web servers:
HTTP/2
Without this, browsers are forced to request one asset at a time per connection. And since they are limited to 6-8 connections per domain and most pages have dozens of assets, this means they need to make multiple round trips to fetch everything they need.
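To put rough numbers on this, here is a minimal Ruby sketch of the arithmetic. It assumes a simplified model in which each HTTP/1.1 connection carries one request at a time and every asset costs one round trip. The method name and the 40-asset page are illustrative assumptions, not measurements.

```ruby
# Simplified model: each HTTP/1.1 connection carries one request at a time,
# so the browser fetches assets in "waves" of at most `connections` requests.
# Bandwidth, request priorities, and browser caching are ignored here.
def http1_round_trips(asset_count, connections: 6)
  (asset_count.to_f / connections).ceil
end

# Hypothetical page with 40 assets.
puts http1_round_trips(40)                 # 6 connections per domain
puts http1_round_trips(40, connections: 8) # 8 connections per domain
```

With HTTP/2, the same requests can be multiplexed over a single connection, so they no longer have to queue behind each other in this way.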
Zero Copy
When transferring an asset, Puma has to copy its content from the disk to its memory, and then copy it again from its memory to the network. While this is happening, one of Puma’s threads is “stuck” handling the copy, using CPU time and memory.
A web server like Nginx handles the transfer by leveraging a Linux system call named sendfile(). When it receives the request, instead of handling the copy itself, Nginx tells the OS to handle it, which allows it to immediately handle the next request. The OS, in turn, uses the DMA controller to set up a transfer directly between the disk and the network.
This not only saves CPU time and memory but is also much faster. Some files get transferred in less than one-third of the time.
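The difference between the two strategies can be sketched in Ruby. This is not Puma's or Nginx's actual code: the method names and the deliver helper are hypothetical, and the sketch relies on IO.copy_stream, which hands both IO objects to Ruby's runtime so that it can use sendfile() where the platform supports it (and fall back to a regular copy where it does not).

```ruby
require "socket"
require "tempfile"

# Strategy 1: userspace copy. Every chunk is read into a Ruby string
# (disk -> process memory) and then written out (process memory -> socket).
def send_with_userspace_copy(file, socket, chunk_size: 16 * 1024)
  sent = 0
  while (chunk = file.read(chunk_size))
    sent += socket.write(chunk)
  end
  sent
end

# Strategy 2: let the OS move the bytes. IO.copy_stream can use sendfile()
# where available, so the file contents never become a Ruby string.
def send_with_copy_stream(file, socket)
  IO.copy_stream(file, socket)
end

# Hypothetical helper: writes `payload` to a temp file, sends it over a local
# socket pair using the strategy given in the block, and returns what arrived.
def deliver(payload)
  Tempfile.create("asset") do |file|
    file.binmode
    file.write(payload)
    file.flush
    file.rewind

    sender, receiver = UNIXSocket.pair
    reader = Thread.new { receiver.read } # drain so the sender never blocks
    yield file, sender
    sender.close                          # signals EOF to the reader
    received = reader.value
    receiver.close
    received
  end
end

payload  = "body { color: red; }\n" * 5_000
copied   = deliver(payload) { |file, socket| send_with_userspace_copy(file, socket) }
streamed = deliver(payload) { |file, socket| send_with_copy_stream(file, socket) }
puts copied == streamed # both strategies deliver the same bytes
```

Both methods produce the same result on the wire. What changes is who does the work: in the first, the Ruby process handles every byte, while in the second it only makes a request to the OS.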
2. The three ways production apps solve this problem
Now that we know why forcing Puma to serve files is a bad idea, the question is, how do we solve this problem?
2.1 Nginx
If you have control of your infrastructure, one option is to use Nginx. How to set it up depends on whether you are using Nginx as a load balancer or as a companion to Puma, and on whether you need to handle Active Storage files.
2.1.1 Running one Nginx process in each VM that has a Puma process
Advantages
- You can use your cloud host’s load balancer.
- One less single point of failure in your infrastructure.
- No need to worry about downtime when the load balancer VM needs to restart to apply security updates.
- Nginx can serve files straight from the public folder of your app, without creating a copy for itself.
Disadvantages
- If you are using Active Storage, Nginx will have to create a local copy of each file. This means that each Nginx process will have to create its own copy, so you will still have multiple requests for the same asset reaching Puma.
- If your infrastructure is set up in a way that deploying means replacing a Docker container, your Nginx cache will be discarded too, unless you mount a folder from the host machine onto the container folder that Nginx uses for storage.
2.1.2 Running a single Nginx process, in a dedicated VM, as the load balancer
Advantages
- Having all assets in a single VM makes it less expensive if you want to pay for attached NVMe SSD storage.
- Active Storage files will be cached after a single request and never reach any Puma instance again.
- No need to worry about deploys clearing the Nginx cache if you are replacing Docker containers.
Disadvantages
- You can’t use your cloud host’s load balancer.
- It’s an extra single point of failure in your infrastructure.
- Unless you are paying extra for enterprise features of your Linux distro, some security updates will require a restart, which will take your app down.
Neutral
- You can’t serve asset files directly from disk, since they aren’t there, so you will need to either configure Nginx to create a local copy the first time it sees them (this is what I do) or change your deploy process to ensure the load balancer VM has the new assets.
2.2 Traditional CDNs (CloudFront)
Upload your files to a storage service (e.g., S3), and then let the cloud provider’s CDN (e.g., CloudFront) serve the files instead of your servers.
Downsides: extra setup work; a more complicated deploy process, because you have to upload new CSS/JS files to the CDN; and extra load on your database from Active Storage, because you will have to use public URLs.
2.3 Reverse proxy CDNs (Cloudflare)
Advantages
- Easiest one to configure. Just enable proxy mode.
- No extra component to monitor
- Automatic compression of assets and polishing of images
- Works for both assets and Active Storage
Disadvantages
- Caching only starts on the third request.
- The cache is not shared between data centers, and there are 300+ of them. Since each one only starts caching on the third request, 600+ requests can reach Puma for every new asset (two uncached requests per data center).
- Just because something is in the cache does not mean it will stay there: retention and freshness are different things, and we can only control freshness.
- If you use Active Storage and have a lot of images (e.g., an ecommerce website), Puma will never stop serving assets.
3. The fourth option: Thruster
Even though it’s distributed as a gem, Thruster is almost entirely written in Go. If you add it to your Gemfile and run bundle open thruster in your terminal, you will see this directory tree:
thruster
|-- exe
|   |-- x86_64-darwin
|   |   |-- thrust
|   +-- thrust.rb
|-- lib
|   |-- thruster
|   |   |-- version.rb
|   +-- thruster.rb
|-- MIT-LICENSE
+-- README.md
Aside from the thrust.rb file, which is what you must execute to initialize Thruster, the only other thing the gem contains is an executable for the platform you have the gem installed on. This means that all configuration variables must be set as UNIX environment variables, not Ruby environment variables (e.g., .rbenv-vars).
This executable is a proxy server that sits between Puma and your load balancer, providing some of the features that are missing in Puma/Rails:
- HTTP/2 support
- Zero-copy file transfers by caching assets after Puma served them once
- Automatic SSL certificate management with Let’s Encrypt
- Asset compression
Compared to the other options:
Advantages
- Zero configuration
- It’s a gem and replaces the default puma command, so it works even on Heroku
Disadvantages
- It caches entirely in memory, and only assets, not Active Storage files.
- Each thrust process needs its own cache, and each restart for a deploy wipes that cache.
- The binary is 10 MB, which eats into the 500 MB limit you have on Heroku.
- Almost no documentation (which is why I’m writing this).
4. Should I use Thruster or one of the three other options?
The answer to that will depend on what your infrastructure looks like:
- Cloudflare only: You do not rely on Active Storage.
- Cloudflare + Thruster: You rely on Active Storage and are using a PaaS like Heroku to deploy your app, since those usually won’t let you run multiple processes in a single VM.
- Cloudflare + Nginx: You rely on Active Storage, built your own infrastructure (using Kamal or by hand), can afford an extra VM as your load balancer, and know how to configure Nginx to cache asset and Active Storage routes.
- Thruster only: You cannot use a reverse proxy CDN and are using a PaaS like Heroku. Just be aware that Heroku’s router only supports HTTP/1.1 when communicating with clients, so even if you add Thruster you won’t get HTTP/2.
- Nginx only: You cannot use a reverse proxy CDN but you built your own infrastructure.
- Traditional CDN: Personally, I can’t see a reason why I would ever want to use one. They just make your deploy process and your app configuration more complex.
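The rules above can be condensed into a small decision function. This is only a sketch of the list as written: the method and parameter names are hypothetical, and it leaves out the traditional CDN option because the list gives no condition under which to pick it.

```ruby
# Encodes the recommendations from the list above.
#   reverse_proxy_cdn: can you put a CDN like Cloudflare in front of the app?
#   active_storage:    does the app rely on Active Storage?
#   paas:              are you deploying to a PaaS like Heroku (as opposed to
#                      infrastructure you built yourself)?
def recommended_setup(reverse_proxy_cdn:, active_storage:, paas:)
  if reverse_proxy_cdn
    return "Cloudflare only" unless active_storage
    paas ? "Cloudflare + Thruster" : "Cloudflare + Nginx"
  else
    # Without a reverse proxy CDN, the list does not distinguish by
    # Active Storage usage.
    paas ? "Thruster only" : "Nginx only"
  end
end

puts recommended_setup(reverse_proxy_cdn: true, active_storage: true, paas: true)
puts recommended_setup(reverse_proxy_cdn: false, active_storage: false, paas: false)
```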
5. Anything else?
Yes, I’m tracking one confirmed bug and one possible bug:
- Thruster not working in new projects due to the “irb” gem: #16
- Thruster caching doesn’t seem to be working: #17
5.1 Changelog
- 04/12 - Updated Nginx section to account for Xavier’s comment about reading files from disk and placing Nginx on the same VM as Puma.
- 04/12 - Added “advantages” and “disadvantages” to Thruster.
- 04/13 - Grammar fixes