ActiveRecord connection pool management

Hi, just a small question on the way connections are managed in ActiveRecord.

From the best I can gather reading the code (and based on behaviour), ActiveRecord will check a connection out of the pool the first time a thread asks for one, then that thread will continue to hang on to the connection until it is closed. What is the reasoning behind this approach as opposed to checking out a connection for each DB operation and letting go of it once the operation is completed?

My team has had a couple of issues with this (leading to us needing to use a connection pool equal to the number of threads being used by our server, which is not really a big problem, but feels a bit counter-intuitive). This post is on much the same subject. I’m really just curious about the way this works.


> Hi, just a small question on the way connections are managed in ActiveRecord.
>
> From the best I can gather reading the code (and based on behaviour), ActiveRecord will check a connection out of the pool the first time a thread asks for one, then that thread will continue to hang on to the connection until it is closed. What is the reasoning behind this approach as opposed to checking out a connection for each DB operation and letting go of it once the operation is completed?

Constantly checking out a connection then checking it back in seems like it would cause a performance bottleneck. The assumption is that lock contention will become the bottleneck if we have to constantly check connections in and out. The other side of that is that it would add complexity throughout AR, as we'd have to change every place that touches the connection.

Doing it once per thread, and giving the user control over checking connections back in (see `release_connection` on the connection pool), seemed like a good compromise between performance and code complexity.
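For reference, a minimal sketch of the two checkout styles the pool supports today; `with_connection` and `release_connection` are real pool API, while the `User` model and an already-established connection are assumptions:

```ruby
# Default behaviour: the first query implicitly checks a connection out
# for the current thread, and the thread then holds on to it.
User.first # `User` is a hypothetical model

# Hand the connection back explicitly once you know you're done with it.
ActiveRecord::Base.connection_pool.release_connection

# Or scope the checkout to a block: the connection is checked out on
# entry and returned to the pool when the block exits.
ActiveRecord::Base.connection_pool.with_connection do |conn|
  conn.execute("SELECT 1")
end
```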

> My team has had a couple of issues with this (leading to us needing to use a connection pool equal to the number of threads being used by our server, which is not really a big problem, but feels a bit counter-intuitive).

Is it counter-intuitive? If you're doing database intensive work, then you'd probably want the same number of connections as you have request threads in order to maximize throughput, otherwise you'll still be blocking other requests at some point (see Amdahl's law).
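To make the sizing claim concrete, here is a toy simulation (plain Ruby, no database, all names invented) of request threads contending for a fixed-size pool; with ten threads and five connections, the second wave of requests waits for the first:

```ruby
require "thread"

POOL_SIZE = 5
THREADS   = 10
WORK_TIME = 0.05 # seconds of simulated "database work" per request

# Model the pool as a blocking queue of connection tokens.
pool = Queue.new
POOL_SIZE.times { |i| pool << i }

start = Time.now
THREADS.times.map {
  Thread.new do
    conn = pool.pop # blocks while all connections are checked out
    sleep WORK_TIME # the simulated query
    pool << conn    # check the connection back in
  end
}.each(&:join)

# Expect roughly 2 * WORK_TIME here: half the threads had to wait for a
# connection. With POOL_SIZE == THREADS it drops to roughly 1 * WORK_TIME.
puts "elapsed: %.3fs" % (Time.now - start)
```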

> This post <https://groups.google.com/d/msg/rubyonrails-core/Z4eiyPnGjwk/21a5KSL_YlIJ> is on much the same subject. I'm really just curious about the way this works.

If we can prove that the concerns above are unwarranted, then it would be fine to change.

> Is it counter-intuitive? If you're doing database intensive work, then you'd probably want the same number of connections as you have request threads in order to maximize throughput, otherwise you'll still be blocking other requests at some point (see Amdahl's law).

Yes, if you're doing heavy database work you certainly want enough connections available to avoid contention. In our case, we had a couple of short database transactions wrapped around other computational work, including a call to an external service (we know). The thing we found counter-intuitive was that when the external call got slow (as they always do), we started being unable to check out database connections. Not a particularly common or desirable use-case, I know, and I now get that we can work around this by manually checking connections out and back in, so that part is good.

I'm certainly not suggesting that optimising for the above case is a good idea; like I said, I was just interested. I don't really come from a web or Ruby background, and I'm used to resource-sharing systems that either force you to check resources out and in explicitly, or do it automatically each time you request the resource. I guess threads in web apps are pretty short-lived, so it wouldn't normally be an issue.

> > Constantly checking out a connection then checking it back in seems like it would cause a performance bottleneck. The assumption is that lock contention will become the bottleneck if we have to constantly check connections in and out.

> I dunno, it feels to me like under normal circumstances you'd not see any *more* lock contention. I would *guess* that under the current scheme, the first few web transactions would be fast (no locking at all), but after that you'd end up waiting on average about the same amount of time (since each thread has to wait for a connection before it can get started…

I don't really follow. If you have 5 request threads and 5 connections available, no request thread will contend on a lock for a connection (unless you are spawning additional threads yourself, in which case you'd be using 1 + new_threads.count connections, which, as I said, is off the beaten path).

> …and then needs to do *all* its operations before it checks it back in). I'm curious, I'll see if I can mock something up to test this using the existing checkout/checkin functionality.

Sounds good! :smiley:

Sorry, I meant the case where you have, say, a pool of five connections and ten request threads (which is entirely possible given the default Rails and, say, Puma configurations). Where you have one connection per thread, I'd expect the existing model to perform very slightly better, because you only go through the process of acquiring a connection once per thread rather than passing through the locking/synchronising path multiple times. When all (or at least a large majority) of the work being done by the app is in the database, I'd expect both models to wait about the same amount of time; the more time you spend processing in the app itself, the more you could benefit from releasing the connection between pieces of database work.

Working on that test case right now, will get back to you :slight_smile:

OK, mock-up is here: <https://github.com/yesterdays-vigilante/activerecord-connection-pool-tests> (some performance tests for different models of connection pool management).

Ah, I see what you're saying now. TBQH, it seems fine if we internally check connections back in; I just don't know how much effort it would require. I suspect there are maybe 1 or 2 places we would have to change to get most of the benefit. The main thing I worry about is checking out a connection where we don't actually need to do any database work (like getting the schema cache).

If you're interested in finding the places we need to fix, I'd be happy to merge the patch! :slight_smile:

"it checks out a connection separately for each database action"

Transactions are per connection; how does that approach handle them?

Xavier,

As far as I know, Rails already doesn’t wrap the entire request in a transaction, so that shouldn’t be an issue.

Unless what you’re saying is that operations already wrapped in a transaction block would be affected.

From what I can tell, ActiveRecord only provides one way to manually open a transaction: a `transaction do ... end` block. In that case, as long as the connection is checked out and checked back in around the whole block, it will still work OK; you'll have the same connection and the same transaction for everything within the block.
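For illustration, a minimal sketch of that block form; the `transaction` API is real, while the model names are made up:

```ruby
ActiveRecord::Base.transaction do
  # Everything in this block runs on the single connection held for the
  # block, so both writes are part of the same transaction.
  Account.find(1).update!(balance: 0)  # `Account` and `AuditEntry` are hypothetical
  AuditEntry.create!(action: "reset")
end # COMMIT happens here (or ROLLBACK if anything above raised)
```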

Aaron, I have no idea if I’ll be able to work that out, but I’ll give it a shot.

The only time I've seen one connection per thread being an issue was in one app that ran many processes and started to hit the connection limit of its DB server during traffic peaks. Even in that scenario, contention is also a risk if you reduce the pool.

Other than the mental image that you're using fewer resources (at the price of contention), I am not sure there is going to be any significant practical win. There could be, I am not saying there wouldn't (I have not done your study), but at first sight the cost/benefit is not clear to me in practical terms.

Regarding transactions, `#execute` is public AR interface, and

`AR::Base.connection.execute("START TRANSACTION")`

is valid AR code. I am not sure if drivers know the connection is in a transaction and have an API to check, but `#transaction_open?` returns false as of this writing. Why would anybody do that? I don't know, maybe because they are writing a heavy SQL-oriented script and doing it manually feels natural… it doesn't matter, it can be done.

Another practical point is that when a connection is checked out, the connection pool tests whether it is alive by issuing, for example, `SELECT 1`. That would mean that if a request performs 200 queries today, they would become 400 queries. I don't know if the alive check could be rethought to run less frequently, but it probably should be, to avoid that 2x.
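One rough sketch of how the check could be throttled; none of this is current AR code, and the interval, method name, and timestamp argument are all invented for illustration:

```ruby
VERIFY_INTERVAL = 30 # seconds; an invented number

# Hypothetical checkout-time check: only ping the server when the
# connection hasn't been verified recently, instead of on every checkout.
# Returns the new "last verified" timestamp for the caller to store.
def verify_if_stale!(conn, last_verified_at)
  return last_verified_at if Time.now - last_verified_at < VERIFY_INTERVAL

  conn.execute("SELECT 1") # the alive check described above
  Time.now
rescue
  conn.reconnect! # real AR adapter method: re-establishes the connection
  Time.now
end
```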

If the concern is long-running threads holding DB connections they don't need, wouldn't a simpler solution be to explicitly return connections to the pool (via `release_connection`) before doing a long-running non-DB operation? Checking out and back in on most operations seems like optimizing for that uncommon case at a runtime cost for the common one.
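Concretely, that pattern might look something like this in a controller; `release_connection` is real pool API, while `Order`, `order_params`, and `SlowPaymentGateway` are all made up:

```ruby
class OrdersController < ApplicationController
  def create
    order = Order.create!(order_params) # uses this thread's pooled connection

    # Return the connection before the slow, non-DB call so another
    # request thread can use it in the meantime.
    ActiveRecord::Base.connection_pool.release_connection
    receipt = SlowPaymentGateway.charge(order) # no DB access in here

    # The next query transparently checks a connection out again.
    order.update!(receipt_id: receipt.id)
    redirect_to order
  end
end
```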

–Matt Jones

We run into problems with connection pools on Heroku.

In Postgres a connection is very expensive, and the cheap/free plans on Heroku Postgres all have very small connection limits; the hobby plan is limited to 20 connections: <https://devcenter.heroku.com/articles/heroku-postgres-plans>

Some people will build really small rack apps that use active record, and want to max out their processes/threads. It’s not that common but it happens.

More commonly, people don't know about this behavior and get `ActiveRecord::ConnectionTimeoutError` because they're running Puma's default of 16 threads against AR's default pool of 5 connections. I wrote these docs to help people understand the connection requirements of AR: <https://devcenter.heroku.com/articles/concurrency-and-database-connections>
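The fix in that article amounts to resizing the AR pool to match the server's concurrency. Roughly the initializer pattern it describes, as a sketch; the env var names follow the article, and this is not a current Rails default:

```ruby
# config/initializers/database_connection.rb
# Sketch of the Heroku-documented pattern: size the pool from an env var
# so it can match the web server's thread count.
Rails.application.config.after_initialize do
  ActiveRecord::Base.connection_pool.disconnect!

  ActiveSupport.on_load(:active_record) do
    config = ActiveRecord::Base.configurations[Rails.env]
    config["pool"] = (ENV["DB_POOL"] || ENV["MAX_THREADS"] || 5).to_i
    ActiveRecord::Base.establish_connection(config)
  end
end
```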

Our larger customers end up hitting the max connection limit of our "production" plans, which is 500 connections. They end up having to run PgBouncer, which does per-machine connection multiplexing: <https://devcenter.heroku.com/articles/concurrency-and-database-connections#limit-connections-with-pgbouncer>

Agreed.

The biggest benefit is going to come to people using a DB like Postgres where each connection uses an entire backend (forked worker essentially), and, given the high cost of backends, it’s recommended to keep the number of connections low.

Checking connections in/out of the pool as necessary would, I think, significantly benefit many medium-to-large applications using Postgres, as it would make it easier to scale the number of Rails backends without increasing the number of PG connections required.

Good point. That's an additional overhead, especially for systems with DBs that are farther away.

One other point, possibly MySQL-specific: there are connection-specific SQL variables. Under the current system, code can rely on them being set after they are set in the same thread:

```ruby
ActiveRecord::Base.connection.execute("SET SQL_MODE = ''")
# now SQL_MODE for the connection is set to the empty string
SomeModel.create(...)
```

With per-DB-interaction checkin/checkout, it doesn't appear to be possible to reliably manipulate these variables, as the connection used in the first statement may not match the one used in the second. More amusing / worrisome, somebody ELSE gets a connection with an altered SQL_MODE...

--Matt Jones

> With per-DB-interaction checkin/checkout, it doesn't appear to be possible to reliably manipulate these variables, as the connection used in the first statement may not match the one used in the second. More amusing / worrisome, somebody ELSE gets a connection with an altered SQL_MODE…

Personally, I would actually prefer to have to open and close a connection explicitly around variables like this, if only because it would serve as a reminder to set, say, SQL_MODE back to what it was; in the current system it only feels slightly harder to hand somebody else a connection with an altered SQL_MODE.
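Something like the following would make the restore explicit; `with_connection`, `select_value`, and `quote` are real AR API, while the helper name and the `SomeModel.create` values are invented:

```ruby
# Hypothetical helper: pin one connection for the block, set the
# variable on it, and restore the old value before the connection
# can go back to the pool.
def with_sql_mode(mode)
  ActiveRecord::Base.connection_pool.with_connection do |conn|
    previous = conn.select_value("SELECT @@SESSION.sql_mode")
    begin
      conn.execute("SET SQL_MODE = #{conn.quote(mode)}")
      yield
    ensure
      # The next borrower never sees an altered SQL_MODE.
      conn.execute("SET SQL_MODE = #{conn.quote(previous)}")
    end
  end
end

with_sql_mode("") { SomeModel.create(name: "x") } # values illustrative
```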

> Other than the mental image that you're using fewer resources (at the price of contention), I am not sure there is going to be any significant practical win. There could be, I am not saying there wouldn't (I have not done your study), but at first sight the cost/benefit is not clear to me in practical terms.

It's not so much the mental image of using fewer resources as the actual ability to use fewer resources. The practical benefit is that for applications which don't do all their grunt work in the database (which seems to be most of them; rendering seems to take at least as long as a DB round trip in most cases), you don't have to saturate your threads with database connections, which means you can scale your app further.

I guess my theory was that there would be negligibly more contention, and my research seems to have backed that up, especially given the below. Happy to be proven wrong!

> Regarding transactions, `#execute` is public AR interface, and `AR::Base.connection.execute("START TRANSACTION")` is valid AR code. I am not sure if drivers know the connection is in a transaction and have an API to check, but `#transaction_open?` returns false as of this writing. Why would anybody do that? I don't know, maybe because they are writing a heavy SQL-oriented script and doing it manually feels natural… it doesn't matter, it can be done.

This is certainly an issue; I'm not sure how to get around it without either connection-per-thread or a reliable `#transaction_open?`. It does feel like a bad thing to do, though; is keeping support for it really that important?

OK; it looks like other systems either do an alive-check on each checkout (which doesn't necessarily need to run a query against the database; JDBC, for example, provides `Connection#isValid`, though I can't find any source for what that actually does under the hood; maybe it does just run a query), or they run the check on a schedule in a background thread, which probably isn't an option here because of the GIL.

My guess is that because my tests were run with the database so close to the app (on the same machine), this wasn’t really an issue; I guess latency would be the real performance-killer in this case. I’ll rerun the tests with an external database just to be sure.

The other issue I noticed (which I just realised I hadn't mentioned) was that I had to disable the query cache middleware to make checking connections in and out possible. I'm not entirely sure what that middleware is for (it appears to semi-temporarily enable the query cache for the duration of each request?), but it poses a problem for checkin/checkout (even in the manual sense).
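For anyone reproducing this: the middleware in question is `ActiveRecord::QueryCache`, which enables the cache around each request and, in doing so, touches a connection. Removing it for an experiment is one line; a sketch, assuming a standard Rails app:

```ruby
# config/application.rb: drop the query cache middleware so a request
# doesn't implicitly grab a connection just to turn the cache on.
config.middleware.delete ActiveRecord::QueryCache
```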