Connection pooling branch ready for testing

My connection pooling branch is getting close to being ready for
merging. At this point, I'd appreciate help in trying it out in real
applications and seeing how it behaves, as well as reviewing the code
and documentation. As far as the code goes, 95% of the changes are in
active_record/connection_adapters/abstract/connection_{specification,pool}.rb.

http://github.com/nicksieger/rails/tree/connection_pool

Database/connection pool configuration is as follows.

1. With no changes to database.yml, the connection pooling code falls
back to a cached connection-per-thread compatibility mode. Try your
app with this first, and make sure everything still works as expected.

2. To try out the fixed size connection pool, add a "pool: N"
attribute to the configuration (yaml below, or in ruby if that's your
new thing):
  production:
    adapter: ...
    ...
    pool: 4 # share a maximum of 4 connections

In order to take advantage of the pool, you actually need to return
connections to the pool after each request. Eventually this should be
baked into ActionPack, but for now you need to add this bit of code to
app initialization. (Note that this can also be used in #1 without any
harm done.)

  Dispatcher.after_dispatch do
    ActiveRecord::Base.clear_active_connections!
  end
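
To show why the explicit release matters, here is a rough standalone
sketch of per-thread reservation. It is not the code in the branch:
SketchPerThreadReservations and its methods are made-up names, and the
symbols stand in for real connections. Each thread keeps the connection
it first asked for until it hands it back, so a request thread that
never releases starves the next one.

```ruby
# Hypothetical sketch, not the branch's implementation: each thread
# reserves one connection and keeps it until it is explicitly released.
class SketchPerThreadReservations
  def initialize(connections)
    @available = connections.dup
    @reserved = {} # Thread => connection
    @mutex = Mutex.new
  end

  # Return the calling thread's connection, reserving one if needed.
  def connection
    @mutex.synchronize do
      @reserved[Thread.current] ||= (@available.pop || raise("pool exhausted"))
    end
  end

  # Hand the calling thread's connection back (the after_dispatch step).
  def release_connection
    @mutex.synchronize do
      conn = @reserved.delete(Thread.current)
      @available.push(conn) if conn
    end
  end

  def available_count
    @mutex.synchronize { @available.size }
  end
end

# Two "requests" on separate threads can share one connection as long
# as each releases it when done.
pool = SketchPerThreadReservations.new([:conn_a])
2.times do
  Thread.new do
    begin
      pool.connection
    ensure
      pool.release_connection
    end
  end.join
end
```

Drop the ensure block and the second thread raises, because the first
thread's reservation is never returned.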

There's also a "wait_timeout" database config attribute that controls
how long the thread will wait for a connection when the pool is
exhausted (default is 5 seconds). If a thread times out waiting for a
database connection, an ActiveRecord::ConnectionTimeoutError (subclass
of ConnectionNotEstablished) is raised.
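
For anyone who wants a picture of the wait_timeout behaviour without
reading the branch, here is a minimal standalone sketch. It is not the
code from the branch: SketchFixedSizePool and
SketchConnectionTimeoutError are made-up names, the block passed to the
constructor stands in for opening a real connection, and the real
implementation may differ in every detail.

```ruby
# Hypothetical stand-in for the timeout error; not the ActiveRecord class.
class SketchConnectionTimeoutError < StandardError; end

# Hypothetical fixed-size pool: creates connections lazily up to `size`,
# then makes callers wait up to `wait_timeout` seconds for a free one.
class SketchFixedSizePool
  def initialize(size:, wait_timeout: 5, &factory)
    @size = size
    @wait_timeout = wait_timeout
    @factory = factory
    @available = []
    @created = 0
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  def checkout
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @wait_timeout
    @mutex.synchronize do
      loop do
        # Reuse an idle connection if there is one.
        return @available.pop unless @available.empty?
        # Otherwise open a new one while we are under the ceiling.
        if @created < @size
          conn = @factory.call
          @created += 1
          return conn
        end
        # Pool exhausted: wait for a checkin or give up at the deadline.
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        if remaining <= 0
          raise SketchConnectionTimeoutError,
                "no connection available within #{@wait_timeout}s"
        end
        @cond.wait(@mutex, remaining)
      end
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @cond.signal # wake one waiting thread, if any
    end
  end
end

pool = SketchFixedSizePool.new(size: 1, wait_timeout: 0.1) { Object.new }
conn = pool.checkout
begin
  pool.checkout # pool of 1 is exhausted, so this waits and then raises
rescue SketchConnectionTimeoutError => e
  puts e.message
end
pool.checkin(conn)
```

The loop around the wait is there so that a spurious wakeup, or another
thread grabbing the connection first, just sends the caller back to
waiting for whatever time is left.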

In the cleanup/deprecation department, I'd like to propose getting rid
of allow_concurrency. If you look at the code currently you'll see it
does have an effect -- it basically adds synchronization around
connection handling methods. But just mentioning allow_concurrency
brings up legacy thoughts of past hacks that didn't really cut the
multi-threaded string cheese. So if possible I'd like to see if
there's a way to handle both single and multiple thread use cases
without it, even though I don't have a good idea yet of how to do
that. Thoughts?

As always, comments appreciated.

/Nick

I'm fine with getting rid of allow_concurrency, as in reality this
should just be a choice between SingleConnection and
FixedSizeConnectionPool. There's not a lot to be gained from skipping
a few mutexes when we have a single operating thread. I'm sure people
can find microbenchmarks which say otherwise, but barring serious
performance degradation I say we just treat this as "choose your N"
and default to 1 rather than "turn on magic threads".

Also, I'm not sure we need to support new-connection-per-thread going
forward. It's a great recipe for leaking resources on your database
server. Can anyone think of a reason they'd want this behaviour
rather than a lazy loading connection pool with a ceiling?

I think we could do with some naming tidy-ups in there as well, as the
meaning has subtly changed for a few of these operations. But bike
shedding can wait a while :)

Finally, there doesn't seem to be any verification of connections
either before they're removed from the pool or after they're returned?
Things like pending transactions or dropped connections would likely
break 'interestingly'.
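
To make the concern concrete, here is a rough sketch of the kind of
checks I mean. Everything in it is hypothetical: SketchConn is a fake
connection, and the method names are placeholders rather than anything
the branch provides.

```ruby
# Hypothetical stand-in for a database connection.
SketchConn = Struct.new(:alive, :open_transactions) do
  def active?
    alive
  end

  # Pretend to re-establish the connection with a clean session.
  def reconnect!
    self.alive = true
    self.open_transactions = 0
  end

  # Pretend to roll back whatever the last user left open.
  def rollback!
    self.open_transactions = 0
  end
end

module SketchVerification
  # Before handing a connection out: replace a dropped connection.
  def self.verify_on_checkout(conn)
    conn.reconnect! unless conn.active?
    conn
  end

  # When a connection comes back: do not let a pending transaction
  # leak into the next request that picks it up.
  def self.reset_on_checkin(conn)
    conn.rollback! if conn.open_transactions > 0
    conn
  end
end

dropped = SketchVerification.verify_on_checkout(SketchConn.new(false, 0))
dirty = SketchVerification.reset_on_checkin(SketchConn.new(true, 2))
puts dropped.active?
puts dirty.open_transactions
```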

Silly Test App works well with this branch though it seems to
overreact when it runs out of connections.

http://github.com/NZKoz/sillytestapp/tree/master

Nice work, looking forward to merging this.

Seems to work well, I did some silly app testing myself with ruby and
jruby.

+1 to removing AR allow_concurrency

Just curious what the difference is between the old cached connection
pool and the fixed size? Is the cached connection pool different than
just configuring the pool size to 1?

Thanks for all your hard work Nick. You totally bailed me out ;)

-- Josh

If you only ever receive requests on a single thread, there's no difference between that and a connection pool of size 1. But if you ever do any multiple-thread access of connections, you have the potential to create a lot of new connections and/or leak connections (if you don't call verify_active_connections! periodically). With connection pools, you can be assured that you limit the number of connection resources you use.
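
A small standalone sketch of the difference, assuming nothing about the
branch itself: the two helper methods and their names are invented for
illustration, and plain Ruby objects stand in for connections. The
per-thread cache ends up with one connection per thread, while the pool
never uses more than its fixed size.

```ruby
require "set"

# Hypothetical: one cached connection per thread, never released.
def connections_with_per_thread_cache(thread_count)
  cache = {} # Thread => connection
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      lock.synchronize { cache[Thread.current] ||= Object.new }
    end
  end
  threads.each(&:join)
  cache.size
end

# Hypothetical: threads borrow from a fixed set and give back when done.
def connections_with_fixed_pool(thread_count, pool_size)
  pool = Queue.new
  pool_size.times { pool << Object.new }
  used = Set.new
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      conn = pool.pop # blocks until a connection is free
      begin
        lock.synchronize { used << conn }
      ensure
        pool << conn
      end
    end
  end
  threads.each(&:join)
  used.size
end

puts connections_with_per_thread_cache(10) # one connection per thread
puts connections_with_fixed_pool(10, 4)    # never more than 4
```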

/Nick

I still can't see a reason that we'd want to support this old way of
operating. Just default the pool to size 1 and people using it in a
threaded environment will have to take care to set it higher or have
sucky performance. Is there a potential downside I'm not aware of?