Connection pooling branch ready for testing

My connection pooling branch is getting close to being ready for merging. At this point, I'd appreciate help in trying it out in real applications and seeing how it behaves, as well as reviewing the code and documentation. As far as the code goes, 95% of the changes are in active_record/connection_adapters/abstract/connection_{specification,pool}.rb.

http://github.com/nicksieger/rails/tree/connection_pool

Database/connection pool configuration is as follows.

1. With no changes to database.yml, the connection pooling code falls back into a cached connection-per-thread compatibility mode. Try your app with this first, and make sure everything still works as expected.

2. To try out the fixed size connection pool, add a "pool: N" attribute to the configuration (yaml below, or in ruby if that's your new thing):

  production:
    adapter: ...
    ...
    pool: 4 # share a maximum of 4 connections

In order to take advantage of the pool, you actually need to return connections to the pool after each request. Eventually this should be baked into ActionPack, but for now you need to add this bit of code to app initialization. (Note that this can also be used in #1 without any harm done.)

  Dispatcher.after_dispatch do
    ActiveRecord::Base.clear_active_connections!
  end

There's also a "wait_timeout" database config attribute that controls how long the thread will wait for a connection when the pool is exhausted (default is 5 seconds). If a thread times out waiting for a database connection, an ActiveRecord::ConnectionTimeoutError (subclass of ConnectionNotEstablished) is raised.
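To make the pool size and wait timeout behaviour concrete, here is a minimal standalone sketch of a fixed-size pool that creates connections lazily up to a ceiling and raises after waiting too long. It is not the code from the branch: TinyPool, PoolTimeoutError and the block-based connection factory are all hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for ActiveRecord::ConnectionTimeoutError.
class PoolTimeoutError < StandardError; end

# Illustrative fixed-size pool; not the implementation in the branch.
class TinyPool
  def initialize(size:, wait_timeout: 5, &factory)
    @size = size                  # ceiling on connections ever created
    @wait_timeout = wait_timeout  # seconds to wait when the pool is exhausted
    @factory = factory            # block that builds a new connection
    @available = []
    @created = 0
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  def checkout
    deadline = now + @wait_timeout
    @mutex.synchronize do
      loop do
        # Reuse an idle connection if one is available.
        return @available.pop unless @available.empty?

        # Otherwise create one lazily, as long as we are under the ceiling.
        if @created < @size
          @created += 1
          return @factory.call
        end

        # Pool exhausted: wait for a checkin, but only until the deadline.
        remaining = deadline - now
        raise PoolTimeoutError, "no connection within #{@wait_timeout}s" if remaining <= 0
        @cond.wait(@mutex, remaining)
      end
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @cond.signal # wake one waiting thread
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

With a pool of size 1, a second checkout while the first connection is still out waits for wait_timeout seconds and then raises; once the connection is checked back in, the next checkout hands out the same object again.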

In the cleanup/deprecation department, I'd like to propose getting rid of allow_concurrency. If you look at the code currently you'll see it does have an effect -- it basically adds synchronization around connection handling methods. But just mentioning allow_concurrency brings up legacy thoughts of past hacks that didn't really cut the multi-threaded string cheese. So if possible I'd like to see if there's a way to handle both single and multiple thread use cases without it, even though I don't have a good idea yet of how to do that. Thoughts?

As always, comments appreciated.

/Nick

I'm fine with getting rid of allow_concurrency, as in reality this should just be a choice between SingleConnection and FixedSizeConnectionPool. There's not a lot to be gained from skipping a few mutexes when we have a single operating thread. I'm sure people can find microbenchmarks which say otherwise, but barring serious performance degradation, I say we just treat this as "choose your N" and default to 1 rather than "turn on magic threads".

Also, I'm not sure we need to support new-connection-per-thread going forward; it's a great recipe for leaking resources on your database server. Can anyone think of a reason they'd want this behaviour rather than a lazy-loading connection pool with a ceiling?

I think we could do with some naming tidy-ups in there as well, as the meaning has subtly changed for a few of these operations. But bike shedding can wait a while :)

Finally, there doesn't seem to be any verification of connections either before they're removed from the pool or after they're returned? Things like pending transactions or dropped connections would likely break 'interestingly'.
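As a rough sketch of the kind of checks I mean (purely illustrative: FakeConnection and both helper methods are hypothetical, not anything in the branch), verification could reconnect dead connections on the way out of the pool and roll back anything left open on the way back in:

```ruby
# Hypothetical connection stand-in; a real adapter would talk to the database.
class FakeConnection
  attr_reader :open_transactions

  def initialize
    @alive = true
    @open_transactions = 0
  end

  def active?
    @alive
  end

  def drop!
    @alive = false # simulate the server dropping the connection
  end

  def reconnect!
    @alive = true
    @open_transactions = 0
  end

  def begin_transaction
    @open_transactions += 1
  end

  def rollback_all!
    @open_transactions = 0
  end
end

# Called just before a connection is handed to a thread.
def verify_on_checkout(conn)
  conn.reconnect! unless conn.active?
  conn
end

# Called when a thread returns a connection to the pool.
def clean_on_checkin(conn)
  conn.rollback_all! if conn.open_transactions > 0
  conn
end
```

Whether the check belongs at checkout, checkin, or both is a design question; the point is only that the pool is the one place where every connection passes through.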

Silly Test App works well with this branch, though it seems to overreact when it runs out of connections.

Nice work, looking forward to merging this.

Seems to work well; I did some silly app testing myself with ruby and jruby.

+1 to removing AR allow_concurrency

Just curious: what is the difference between the old cached connection pool and the fixed size one? Is the cached connection pool any different than just configuring the pool size to 1?

Thanks for all your hard work Nick. You totally bailed me out ;)

-- Josh

If you only ever receive requests on a single thread, there's no difference between that and a connection pool of size 1. But if you ever do any multiple-thread access of connections, you have the potential to create a lot of new connections and/or leak connections (if you don't call verify_active_connections! periodically). With connection pools, you can be assured that you limit the number of connection resources you use.
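Here's a toy illustration of that difference, using plain objects as stand-in connections and Ruby's Queue as a stand-in pool (none of this is the branch's code, and the thread and pool counts are arbitrary):

```ruby
require "set"

thread_count = 10
pool_size = 4
lock = Mutex.new

# Old style: every thread lazily opens and caches its own connection.
per_thread = {}
thread_count.times.map {
  Thread.new do
    lock.synchronize { per_thread[Thread.current] ||= Object.new }
  end
}.each(&:join)
per_thread_connections = per_thread.size # one connection per thread

# Pool style: threads borrow from a fixed set and hand connections back.
pool = Queue.new
pool_size.times { pool << Object.new }
seen = Set.new
thread_count.times.map {
  Thread.new do
    conn = pool.pop # blocks until a connection is free
    begin
      lock.synchronize { seen << conn.object_id }
    ensure
      pool << conn # always return the connection
    end
  end
}.each(&:join)
pooled_connections = seen.size # never more than pool_size

puts "per-thread: #{per_thread_connections}, pooled: #{pooled_connections}"
```

The per-thread count grows with the number of threads, while the pooled count can never exceed the pool size no matter how many threads show up.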

/Nick

I still can't see a reason that we'd want to support this old way of operating. Just default the pool to size 1 and people using it in a threaded environment will have to take care to set it higher or have sucky performance. Is there a potential downside I'm not aware of?