TL;DR
In multi-tenant Rails apps using database-per-tenant, the only knobs we have to bound total connections per app instance are per-pool (max_connections, idle_timeout, max_age, etc.). This means worst-case total connections scale with tenant_count × max_connections, with no process-wide ceiling. I’d like to propose an opt-in cross-pool reaping step: when reaping runs, if a global cap is configured and the sum of open connections (in use or idle) across pools exceeds the cap, reap idle connections across pools until the total is back under the cap.
Background discussion (which I’m migrating here per CONTRIBUTING): Reap Connections Based on Total Connections Across Pools · rails/rails · Discussion #55529 · GitHub
Problem
We run a multi-tenant app with a separate database per tenant — each tenant gets its own ConnectionPool. As tenant count grows, two things scale badly:
1. DB-server side: total connections to a shared Postgres instance can spiral. (PgBouncer/RDS Proxy mitigates this, but is out of scope here.)
2. App-instance side: total connections held by one app process can spiral to roughly N_tenants × max_connections_per_pool + 1. There’s no upper bound enforceable from within Rails.
For (2), today’s mitigation is to tune idle_timeout and reaping_frequency aggressively, but those are best-effort: they reduce average usage without guaranteeing a ceiling. A traffic burst across many tenants can still push total connections well past what the app’s resources (or the DB’s max_connections) can support.
What 8.1 already gives us
The new options in 8.1 (max_age, min_connections, pool_jitter, refined keepalive) are genuinely helpful — max_age in particular forces recycling of long-lived connections, and pool_jitter reduces synchronized reconnect storms
across many pools. But all of these are per-pool and none provide a global ceiling. If I have 500 tenants and max_connections: 5, I can still legitimately hold 2,500 connections from one app instance during a burst.
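To make the arithmetic concrete, here is a small plain-Ruby sketch (no ActiveRecord involved). The method names and the 200-connection budget are hypothetical, chosen only for illustration; the 500 tenants and max_connections: 5 are the numbers from the paragraph above.

```ruby
# Worst-case connections one app instance can hold if every tenant pool fills.
def worst_case_connections(tenant_count, max_connections_per_pool)
  tenant_count * max_connections_per_pool
end

# How many tenant pools can be fully saturated at once within a given budget.
# The budget is a hypothetical per-instance limit, not an existing Rails setting.
def saturated_pools_within(budget, max_connections_per_pool)
  budget / max_connections_per_pool
end

puts worst_case_connections(500, 5) # 2500
puts saturated_pools_within(200, 5) # 40
```

In other words, with a budget of 200 connections, a burst that saturates more than 40 of the 500 pools already exceeds it, and no per-pool setting prevents that.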
Proposed solution (sketch)
Add an opt-in configuration (working name: max_total_connections, settable on ActiveRecord::Base.connection_handler or as a top-level setting) that:
- During the existing reaping cycle, after each pool’s local reap completes, check sum(connections_in_use_or_idle) across all pools managed by the handler.
- If the sum exceeds the configured cap, iterate pools (whether LRU by last activity or round-robin is an open question) and reap idle connections until the sum is back under the cap.
- Connections in active use are never forcibly closed: the cap is a soft cap that’s enforced by aggressive idle reaping, not by interrupting checked-out work. If demand legitimately exceeds the cap and no idle connections exist to reap, callers experience the existing checkout_timeout semantics.
This is purely additive: the option defaults to Float::INFINITY (current behavior).
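A minimal sketch of the steps above, to make the intended behavior easier to discuss. Everything here is hypothetical: FakePool is a stand-in that only tracks counts (it is not ActiveRecord's ConnectionPool), enforce_total_cap is an illustrative name, and pools are visited in plain array order because the reaping order is still an open question.

```ruby
# Stand-in for a connection pool; NOT ActiveRecord's ConnectionPool.
# It tracks only the counts this sketch needs.
FakePool = Struct.new(:name, :in_use, :idle, keyword_init: true) do
  def total
    in_use + idle
  end

  # Close up to `count` idle connections; return how many were closed.
  # Checked-out connections are never touched.
  def reap_idle(count)
    closed = [count, idle].min
    self.idle -= closed
    closed
  end
end

# Hypothetical cross-pool step, run after each pool's own reap has finished.
# Returns how many connections remain over the cap (0 when at or under it).
def enforce_total_cap(pools, max_total_connections)
  excess = pools.sum(&:total) - max_total_connections
  return 0 if excess <= 0 # also covers the Float::INFINITY default

  pools.each do |pool|
    break if excess <= 0
    excess -= pool.reap_idle(excess)
  end

  [excess, 0].max
end

pools = [
  FakePool.new(name: "tenant_a", in_use: 3, idle: 2),
  FakePool.new(name: "tenant_b", in_use: 1, idle: 4),
  FakePool.new(name: "tenant_c", in_use: 5, idle: 0),
]

remaining = enforce_total_cap(pools, 10)
puts pools.sum(&:total) # 10
puts remaining          # 0
```

A non-zero return value means in-use connections alone exceed the cap. In that case the sketch closes nothing further, which matches the soft-cap behavior described above.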
Open questions
- Scope. Should the cap apply per-ConnectionHandler, per-process, or be configurable? Per-handler seems right (one cap per “role” in the multi-DB sense) but I’d love input.
- Reaping order. When over-budget, which pool gives up connections first? Options: LRU by last activity, largest pool first, weighted by min_connections, or simple round-robin. LRU feels most intuitive but adds bookkeeping.
- Interaction with min_connections. A pool with min_connections: 2 should presumably not be reaped below its floor even when over budget; otherwise the floor is meaningless. Is that the right call, or should the global cap override?
- Where to wire it in. Most naturally inside the Reaper, called once per reaping cycle after each pool’s reap finishes. But there may be a cleaner place I’m not seeing.
- Naming. max_total_connections? global_max_connections? connection_handler_max_connections?
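To make the reaping-order and min_connections questions more concrete, here is a rough sketch of two of the candidate orderings with the per-pool floor respected. All names are hypothetical: TenantPool is a stand-in struct, last_activity is assumed bookkeeping that real pools may not expose, and reap_plan only computes a plan without closing anything. Round-robin and min_connections weighting are left out for brevity.

```ruby
# Stand-in pool for illustration only; these fields are hypothetical
# bookkeeping, not attributes of ActiveRecord's ConnectionPool.
TenantPool = Struct.new(:name, :in_use, :idle, :min_connections, :last_activity,
                        keyword_init: true) do
  def total
    in_use + idle
  end

  # Idle connections that can be closed without dropping below the floor.
  def reapable
    [[total - min_connections, 0].max, idle].min
  end
end

# Two of the candidate orderings; each returns pools in the order in which
# they should give up connections.
ORDERINGS = {
  lru:           ->(pools) { pools.sort_by(&:last_activity) },
  largest_first: ->(pools) { pools.sort_by { |pool| -pool.total } },
}.freeze

# Plan how many idle connections each pool gives up to shed `excess`.
# Returns a { pool name => count } hash and does not mutate the pools.
def reap_plan(pools, excess, order: :lru)
  plan = {}
  ORDERINGS.fetch(order).call(pools).each do |pool|
    break if excess <= 0
    take = [pool.reapable, excess].min
    next if take.zero?
    plan[pool.name] = take
    excess -= take
  end
  plan
end

pools = [
  TenantPool.new(name: "a", in_use: 1, idle: 3, min_connections: 2, last_activity: 300),
  TenantPool.new(name: "b", in_use: 0, idle: 2, min_connections: 1, last_activity: 100),
  TenantPool.new(name: "c", in_use: 2, idle: 4, min_connections: 0, last_activity: 200),
]

p reap_plan(pools, 3, order: :lru)           # {"b"=>1, "c"=>2}
p reap_plan(pools, 3, order: :largest_first) # {"c"=>3}
```

With the floor respected, pool "b" gives up only one of its two idle connections even though it is the least recently used. If the global cap were allowed to override min_connections, reapable would simply return idle.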
Happy to put together a patch if there’s appetite for this. Wanted to validate the approach first.