Maintain connections to two databases

Howdy.

My app is required to perform some user-driven CRUD operations on two databases simultaneously. This is not the normal situation, but it applies to a low-activity admin controller.

My current solution involves three models - DbDual, DbPrimary, and DbSecondary. DbDual uses the other two within nested transactions, and DbSecondary does an establish_connection to connect it to the second database.

This seems to work fine, although it's _very_ slow. I assume the slowness is due to the connecting each time the object springs into existence.

So my question is, how can I make a connection (and maintain it) indefinitely?

I've searched the web for solutions, but I'm not seeing what I think I want. I think I just want a simple connection made during boot, and then be able to tell a model to use that connection.

Comments/Suggestions?

Thanks much!

wild guessing, i think rails is single threaded and if you make a connection persistent you will eat all the resources in no time. I believe passenger loads a rails app per connection but as soon as the client it served the the app gets unloaded, but if the connection is permanent i think the app would never unload.

so maybe in rails 3 you could have a pool of activerecord instances and the rest of rails dynamically been loaded by passenger

I'm actually not using Passenger. I use a cluster of 50 mogrels behind an Apache2. Since my app started pre Rails 2.0 (and is only now up to 2.3.8), I'm not using connection pooling (as far as I know :slight_smile: ). My understanding is that each mongrel instance is maintaining its own db connection. I figure having each instance maintain two connections (one to each db) shouldn't be a bad thing.

As far as resources go, this app is the only thing running on a big Sun box. I'm only using a tiny fraction of the 20GB of ram (in fact, my biggest resource problem is pure CPU processing performance). I'm really not a fan of Sun's large multicore "green thread" processors. They're dog slow per core.

For what it's worth, Oracle is the database server. I don't think that should make a difference to my solution.

ill google and give some thought to see how i can help

Your code looks like this?

  class DbSecondary < ActiveRecord::Base     establish_connection :connection_name   end

This does what it says, it connects DbSecondary (and its descendants, if any) to the named database connection. If you suspect that this doesn't work as intended, look what the logs of your DB server say. If you see that new connections to the DB are established for each request, make sure that class caching is really enabled in production environment.

Regarding transactions: don't expect any coordination over and above what you do yourself. ActiveRecord does not give you distributed transactions. I haven't tried this myself, so big grains of salt are in order. Let's say you have code like this

  DbDual.transaction do     DbPrimary.transaction do       # do something in first database     end     DbSecondary.transaction do       # do something in second database     end   end

In execution, this code is translated to two independent transactions, one on each of the databases. The DbPrimary transaction is mapped to a SAVEPOINT/RELEASE SAVEPOINT pair within the DbDual transaction. The DbSecondary transaction is completely independent. However, if this latter transaction raises a (suitable) exception, it still aborts the enclosing DbDual transaction in addition to the DbSecondary one.

If a failure occurs immediately after the DbSecondary.transaction is committed, for example because the connection to the first database was lost, the second transaction will be committed whereas the first one will be rolled back when the DB server decides it has timed out (or whatever else might cause it to drop the transaction).

Michael

Michael,

Yes, I'm making the connection in my model as you describe. That is working fine in the sense that it is connecting to the secondary database.

My problem is, it appears to be connecting every time I instantiate that object. I've traced the entire establish_connection path, and it appears to be creating a new ConnectionPool object each time. What's happening in establish_connection() is a little confusing to me.

As far as I can tell, there is no re-use of connections to the secondary database.

Regarding my transactions, I'm doing it like this:

    prod.transaction do       test.transaction do         begin           prod_result = prod.connection.execute(prod_sql)           test_result = test.connection.execute(test_sql)         rescue           Util.lge "/// update exception"           Util.lge "/// col_names = #{col_names.join(',')}"           Util.lge "/// values = #{values.join(',')}"           Util.lge "/// prod_sql = #{prod_sql}"           Util.lge "/// test_sql = #{test_sql}"           Util.lge "/// exception = #{$!}"           raise ActiveRecord::Rollback         end       end     end

My transaction code seems to work, but I need to test more failure scenarios.

Something's suspicious here. The behavior of establish_connection is correct (I think), but I don't see why it would be called more than twice for each instance of your app. Once each for the primary and secondary database. If establish_connection is called more often than that in the production environment(!) you ought to investigate the reason. Usually, class caching is enabled there and connections are retained across requests.

Another angle of attack is to set a breakpoint (or puts caller) somewhere deep down in the database adapter to see when and from where connections are established.

Michael

I tested this in production mode, and it's _much_ faster. It's fast enough that I'm almost certain that it's not reconnecting to the secondary database each time (at least in production mode). I don't know if it's just overhead of other development behaviors or if it's actually doing things differently (with respect to connections/pools/ reconnects); and I don't have time to investigate deeper.

But thanks for the suggestions and analyses!

Did you originally write that your problem is with behavior in development environment? I was assuming production. The reason for the behavior you notice is simple: In development environment, class caching is disabled, in other words, classes are reloaded on each request so that changes are immediately effective. An intended consequence is that class variables and class instance variables (i.e., instance variables of class objects) are re-initialized again, too. Among those being the variables that hold the database connection.

Michael

Well! That explains a lot. :slight_smile:

This was all in development mode. As slow as it is in development mode, I'd never release it to production. Fortunately, for reasons you explained above, it does not suffer the same performance problems in production mode.

I suppose since previously I was only working with the one database, that connection was maintained outside the scope of caching/non- caching models. But with this new connection defined and used within a model, the non-caching must have been causing a new connection to happen on each use.

Thanks for your time and knowledge.