Does anybody use the new horizontal sharding support in ActiveRecord 6.1 combined with schema management and migrations? I found that it creates a schema file for every shard. Is this intentional? Is there a way to prevent it?
We have a model which is functionally partitioned off from our primary database, and is horizontally sharded. Right now we do this with a database configuration per shard, and a model per shard which does an establish_connection to the appropriate configuration. Schema is managed for these shards manually.
This is a simplified example:
# config/database.yml
development: ...
chunks_shard_one: ...
chunks_shard_two: ...
module Chunks
  class Base < ActiveRecord::Base
    # There is no `chunks` table in the primary database, so this model would
    # fail if used, but we want all child classes to have the same table name.
    self.table_name = "chunks"
  end

  class ShardOne < Base
    establish_connection :chunks_shard_one
  end

  class ShardTwo < Base
    establish_connection :chunks_shard_two
  end

  SHARDS = {shard_one: ShardOne, shard_two: ShardTwo}

  def self.for(supermodel)
    SHARDS.fetch(supermodel.shard_id).where(supermodel_id: supermodel.id).all
  end
end
I’d like to be able to do this, and have each shard based on the same schema and migrated from the same migrations:
# config/database.yml
development:
  primary: ...
  chunks_shard_one:
    ...
    migrations_path: db/chunks_migrate
  chunks_shard_two:
    ...
    migrations_path: db/chunks_migrate
class ChunkRecord < ActiveRecord::Base
  self.abstract_class = true

  # connects_to (not connected_to) declares the shards on the abstract class.
  connects_to shards: {
    shard_one: { writing: :chunks_shard_one, reading: :chunks_shard_one },
    shard_two: { writing: :chunks_shard_two, reading: :chunks_shard_two },
  }
end

class Chunk < ChunkRecord
end

module Chunks
  def self.for(supermodel)
    ChunkRecord.connected_to(role: :writing, shard: supermodel.shard_id) do
      Chunk.where(supermodel_id: supermodel.id).all
    end
  end
end
But when I now run:
bin/rails db:prepare
I get two schema files:
db/chunks_shard_one_schema.rb
db/chunks_shard_two_schema.rb
Where I only want:
db/chunks_schema.rb
Poking around in ActiveRecord::Tasks::DatabaseTasks and friends, it looks like the filename is based on the configuration name, like chunks_shard_one. That makes sense until you think about horizontal sharding.
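To make the naming rule concrete, here is a minimal sketch of how the dump filename appears to be derived. It is a simplified stand-in written for this post, not the actual Rails source: `schema_dump_path` and its `primary:` and `db_dir:` arguments are hypothetical names, and only the `.rb` format is modelled.

```ruby
# Simplified model of the per-configuration schema filename, based on the
# files db:prepare produced. Not the real Rails implementation.
def schema_dump_path(config_name, primary: false, db_dir: "db")
  # The primary configuration gets the plain schema.rb; every other
  # configuration gets its own name as a prefix.
  filename = primary ? "schema.rb" : "#{config_name}_schema.rb"
  File.join(db_dir, filename)
end

# Two shard configurations produce two distinct schema files.
paths = %w[chunks_shard_one chunks_shard_two].map { |name| schema_dump_path(name) }
puts paths
```

Because the name is the only input, two configurations that describe the same schema can never share one file under this rule.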
I can’t see a way to override the filename per configuration, or that might head toward a good solve:
# config/database.yml
development:
  primary: ...
  chunks_shard: &chunks_shard
    schema_path: db/chunks_schema.rb
    migrations_path: db/chunks_migrate
  chunks_shard_one:
    <<: *chunks_shard
    ...
  chunks_shard_two:
    <<: *chunks_shard
    ...
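As a rough illustration of what such an override could do, the sketch below resolves the schema file from plain hashes rather than Rails' configuration objects. Everything here is an assumption for discussion: `schema_path` is the hypothetical key from the config above (I could not find one in 6.1), and `resolve_schema_path` is a made-up helper.

```ruby
# Hypothetical configurations, mirroring the YAML above as plain hashes.
CONFIGS = {
  "chunks_shard_one" => { "schema_path" => "db/chunks_schema.rb", "migrations_path" => "db/chunks_migrate" },
  "chunks_shard_two" => { "schema_path" => "db/chunks_schema.rb", "migrations_path" => "db/chunks_migrate" },
}.freeze

# Prefer an explicit schema_path (hypothetical key); otherwise fall back to
# the name-based default.
def resolve_schema_path(name, config)
  config.fetch("schema_path") { File.join("db", "#{name}_schema.rb") }
end

# Both shards resolve to the same file, so only one schema would be dumped.
schema_files = CONFIGS.map { |name, config| resolve_schema_path(name, config) }.uniq
puts schema_files
```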
I’ve been staring at this too long and feel like I might be missing something.
Or am I holding it wrong? Is this a gap? Is there a solve planned? Or would a contribution be welcome?