Same table, multiple DBs?

I'm starting to think about how my application can scale if it needs to. This is all very premature - I don't even have clear performance goals yet, much lest perf testing in place, so it may be unnecessary. However, one quality my data has is that it's isolated into groups of records that never reference anything outside of that group. This leads me to think I can have a small group of tables in a main DB to store user info, and the rest of my tables each have an instance in multiple databases:

   Main: Users     Auxilliary #1: Table1, Table2, Table3, ...     Auxilliary #2: Table1, Table2, Table3, ...     ...

The nice thing about this design is that it would scale very nicely. If things are slowing down, I can easily bring up another server and balance the data. Users, would of course track which DB instance the user connects to when using the app. The users table is referenced frequently, but rarely updated, so if that becomes a bottleneck I can probably have a second instance with some reasonable data replication between the two.

Is there any built-in way to handle this in rails (or a popular plugin that can do this)? My very initial investigation seems to indicate a specific table can be stored on another DB, but that ALL of the rows are stored on that DB. I can't split this up across multiple servers.

As I said, this is premature, so it's more of a curiosity than a critical problem I need to solve. Just curious if anyone has done anything along these lines alread.

Brian wrote:

Is there any built-in way to handle this in rails (or a popular plugin that can do this)? My very initial investigation seems to indicate a specific table can be stored on another DB, but that ALL of the rows are stored on that DB. I can't split this up across multiple servers.

As far as I know, you're right that you can only specify a different DB on a per-model basis.

But you could make sub-classes of each of your models, one per each db, and then specify the different DB on each of those sub-classes. But then when doing a fetch, you'd have to choose the right model to do the fetch from -- but of course, that's kind of the point of why you can't have more than one db on just ONE model -- how would Rails know what DB to go to when you simply tried to do a Model.find on that model?

I'm not really feeling confident that the architecture you're proposing really is the best solution to scaling up, but if you wanted it, I think ActiveRecord could support it in that manner -- by first writing your models, and then providing N sub-classes of each model you want to 'partition' to N databases. You could even provide some code to do this 'dynamically', you wouldn't need to (and wouldn't want to) actually hand-code a separate file on disk for each model. All the associations defined on the 'base' model would need to be re-defined for each model sub-class to use the appropriate destination model-subclass. This could also be done in an automated dynamic way. And then you'd need to provide some helper methods to figure out _which_ model to use for a given fetch (Model.find) operation.

I'm not aware of any plug-in that already does this. I'm not sure it's a good idea, like I said, but I'm no expert. I think you'd have to write the logic yourself, but I think it's perfectly do-able within ActiveRecord's architecture. But no doubt it will end up somewhat more complicated to get working right than it seems at first, such things always do.

Jonathan

Oh! I hadn't even considered that. I will look into MERGE tables and see if that's going to do what I need. I'm not set on this as the best solution either, just running through possibilities and making sure I don't exclude any reasonable options with poor design. I've also put some thought into having specific tables groups of tables hosted on other DB servers, but that is a much less mysterious implementation so I didn't need to ask if anyone has tried it before. :slight_smile:

Although completely splitting up tables (ie sharding) is a well known technique when having a single database server becomes a bottleneck. Depending on the balance of reads and writes in your application, there can be gains to be had by doing your writes to the master server and doing reads from one (or more) slave servers that replicate the master. There is a plugin (masochism) that used to handle some of this, I haven't looked at it in a while though.

Fred