ActiveRecord per user/tenant connection pool

Hi all!

For my multi tenant application i am trying to work out how i can make connection pools per tenant. While stil using

one rails app.

Advantages of making it possible to connect with different usernames/credentials per request.

  • With supplying a per-request user. We can force database level security (for instance schema based isolation)

  • With supplying per request user connection. we can force the postgres 9.5**. row level security.**

This boils down to being able to have a connection pool per user/tenant/connectionspec in active record.

Unfortunately this is extremely difficult thing to do:

Right now i’ve monkey patched connection handler in order to create separate connection pools per tenant. But of course

this is really sup optimal. I see a couple of things that could be done to make this extensible for once and for all:

  • Make it possible to make connection handler pluggable. I’ve tried to do this, but documentation is lacking and i can’t seem to get it work.

  • Make it possible to create connection pools based on specs.

There are a couple of gems like apartment, multidb and other which are just bunch of difficult monkey patches in order to get

this desired behavior.

I hope i made myself clear a bit,

  • Emile
  • Make it possible to make connection handler pluggable. I’ve tried to do this, but documentation is lacking and i can’t seem to get it work.

The connection handler is already pluggable. You can configure it like so:

config.active_record.connection_handler = MyConnectionHandler.new

  • Make it possible to create connection pools based on specs.

I believe that it’s already possible to create connection pools based with a ConnectionSpecification. https://github.com/rails/rails/blob/v4.2.3/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L227-L252

A connection per tenant might cause you to reach your database’s connection limit rather quickly. On top of this, AR isn’t exactly fast at establishing new connections.

I’ve done some research into multitenancy systems and I’ve written a book about it: https://leanpub.com/multi-tenancy-rails-2

The short version is that you should look at the Apartment gem’s Postgres schema switching support if you’re on a small database, or stick with the regular foreign key scoping (ie tenant_id on the records that belong to a particular tenant) if you’re on a large databases.

The problem lies in that if you aren’t connected as a specified user, you can’t use the databases built in features to secure the tenants data like Row Level Security. Or even secure the schema per customer. Simply because the database, postgres in this case, doesn’t know who you are and therefore cant enforce security rules.

Therefore each tenant should have its own db user.

Even if you reach the database connections limit, thats up for the database layer to solve. Rails shouldn’t be in the way of this.

Have you ever manage to get this this to work? Even when override from the current connection handler i can’t seem to get the right behavior oob.

Allowing a connection pool per client is definitely a wrong way of do this, given the high overhead of PG backends and connection limits.

I believe there are other ways to accomplish this even with row level security. For example, you could set a per-connection variable in PG on connection checkout and have your row security policies check that variable rather than the current DB user. Not only will this give you far greater performance, it will also be far more versatile.

The only time you should have different DB users per customer is if you’re running a separate app layer instance(s) per customer. Otherwise you’re creating a flawed design that will bit you later on. Database users exist to restrict at a lower level than multi-tenant-per-app policies; they’re about securing the database for different use cases (such as some apps only needing to write to certain tables, or a reporting user/app only having read access, etc.) I believe you’re misunderstanding the purpose of database level users.

well of thats the case please explain me a practical proved implementation

I don’t currently have easy access to a PG 9.5 box, but I think the following would work:

CREATE TABLE customer_orders (customer_id integer, order_description text);

ALTER TABLE customer_orders ENABLE ROW LEVEL SECURITY;

CREATE POLICY customer_policy ON customer_orders

USING (true)

WITH CHECK (myvars.customer_id = customer_orders.customer_id);

Then in a before_filter on your controllers you use something similar to the following SQL to set the restriction:

SET myvars.customer_id = 10;

See: http://www.postgresql.org/docs/9.5/static/ddl-rowsecurity.html and http://stackoverflow.com/questions/28047911/how-to-set-some-context-variable-for-a-user-connection

Also, pro tip: you’re more likely to get help if you don’t demand a premade solution and and instead demonstrate that you’ve already researched this yourself. Google is your friend, but you don’t really seem to have used it.

i did. the issue at hand is that its hard to mitigate a sql jnjection attack without setting users at connection level.

this wouldnt mitigate it.

If you’re using Active Record’s querying interfaces properly, you shouldn’t be vulnerable to SQL injection attacks.

If everything in the whole world were done perfectly, nothing would ever be vulnerable to anything. But out here in the real world where murphy’s law exists, an attitude of “defense in depth” is more practical, in addition to doing one’s very best to be as perfect as possible.

We’ve done it using MySQL views. You construct all tenant-dependent views with a tenant_id column and set the formula to check against a session variable. We put insert triggers to set the tenant_id initially and an update trigger to throw an error if someone tried to change it (they can’t through our interface but just for security). The views are named like “users” and the data tables are named like “users_data”. AR doesn’t care if it’s pulling from a view, so it thinks it’s just doing a normal query, but since it’s accessing the view it’s impossible for it to get data from other tenants.

Probably not the best approach from a speed perspective, but it has scaled really well for us and we only have to maintain one schema. Additionally we never have to mess with connections because each app server (we run passenger) has its own database connection which means its own session variable which means its own scoping. You just set the variable at the very beginning of each request (for us it’s based on subdomain).