I'm starting to think about how my application can scale if it needs
to. This is all very premature - I don't even have clear performance
goals yet, much lest perf testing in place, so it may be unnecessary.
However, one quality my data has is that it's isolated into groups of
records that never reference anything outside of that group. This
leads me to think I can have a small group of tables in a main DB to
store user info, and the rest of my tables each have an instance in
multiple databases:
The nice thing about this design is that it would scale very nicely.
If things are slowing down, I can easily bring up another server and
balance the data. Users, would of course track which DB instance the
user connects to when using the app. The users table is referenced
frequently, but rarely updated, so if that becomes a bottleneck I can
probably have a second instance with some reasonable data replication
between the two.
Is there any built-in way to handle this in rails (or a popular plugin
that can do this)? My very initial investigation seems to indicate a
specific table can be stored on another DB, but that ALL of the rows
are stored on that DB. I can't split this up across multiple servers.
As I said, this is premature, so it's more of a curiosity than a
critical problem I need to solve. Just curious if anyone has done
anything along these lines alread.
Is there any built-in way to handle this in rails (or a popular plugin
that can do this)? My very initial investigation seems to indicate a
specific table can be stored on another DB, but that ALL of the rows
are stored on that DB. I can't split this up across multiple servers.
As far as I know, you're right that you can only specify a different DB
on a per-model basis.
But you could make sub-classes of each of your models, one per each db,
and then specify the different DB on each of those sub-classes. But then
when doing a fetch, you'd have to choose the right model to do the fetch
from -- but of course, that's kind of the point of why you can't have
more than one db on just ONE model -- how would Rails know what DB to go
to when you simply tried to do a Model.find on that model?
I'm not really feeling confident that the architecture you're proposing
really is the best solution to scaling up, but if you wanted it, I think
ActiveRecord could support it in that manner -- by first writing your
models, and then providing N sub-classes of each model you want to
'partition' to N databases. You could even provide some code to do this
'dynamically', you wouldn't need to (and wouldn't want to) actually
hand-code a separate file on disk for each model. All the associations
defined on the 'base' model would need to be re-defined for each model
sub-class to use the appropriate destination model-subclass. This could
also be done in an automated dynamic way. And then you'd need to
provide some helper methods to figure out _which_ model to use for a
given fetch (Model.find) operation.
I'm not aware of any plug-in that already does this. I'm not sure it's a
good idea, like I said, but I'm no expert. I think you'd have to write
the logic yourself, but I think it's perfectly do-able within
ActiveRecord's architecture. But no doubt it will end up somewhat more
complicated to get working right than it seems at first, such things
always do.
Oh! I hadn't even considered that. I will look into MERGE tables and
see if that's going to do what I need. I'm not set on this as the
best solution either, just running through possibilities and making
sure I don't exclude any reasonable options with poor design. I've
also put some thought into having specific tables groups of tables
hosted on other DB servers, but that is a much less mysterious
implementation so I didn't need to ask if anyone has tried it
before.
Although completely splitting up tables (ie sharding) is a well known
technique when having a single database server becomes a bottleneck.
Depending on the balance of reads and writes in your application,
there can be gains to be had by doing your writes to the master server
and doing reads from one (or more) slave servers that replicate the
master. There is a plugin (masochism) that used to handle some of
this, I haven't looked at it in a while though.