Viktar Patotski ·
· Architecture
· 8 min read
Postgres Row-Level Security for Multi-Tenancy: The Pattern and the Footguns
Row-Level Security moves tenant isolation out of every developer's memory and into the database itself. Here is the exact Postgres pattern for a pooled multi-tenant app, plus the four footguns that leak data in production if you miss them.
TL;DR - In a pooled multi-tenant database, every query needs a
WHERE tenant_id = ...filter, and the day someone forgets one, a customer sees another customer’s data. Row-Level Security (RLS) makes Postgres enforce that filter for you, on every query, automatically. The pattern is small:
ENABLE ROW LEVEL SECURITYon each tenant table, add a policy that checkstenant_idagainst a session variable, andFORCE ROW LEVEL SECURITYso even the table owner obeys it.- Connect your app as a dedicated non-owner, non-superuser role, because owners and superusers bypass RLS by default.
- Set the tenant per request with
SET LOCAL, never plainSET, or a connection pooler will leak one tenant’s context into the next request.- Put
tenant_idas the leading column of every index, or RLS makes your queries scan everything.Get those four right and you have isolation the database enforces, not isolation you hope every developer remembers.
The problem RLS actually solves
The pool model of multi-tenancy keeps every tenant’s rows in shared tables,
told apart by a tenant_id column. It is the cheapest model to run, and the
whole thing rests on one fragile assumption: that every query, forever, written
by every developer, remembers to filter by tenant_id.
That assumption breaks. A new hire writes a report query without the filter. An ORM generates a query that misses it on a join. A quick admin script forgets. Any one of those is a cross-tenant data leak, the kind that ends trust and sometimes contracts.
Row-Level Security flips it. You define the rule once, in the database, and Postgres applies it to every query against the table whether the application remembered to or not. The filter is no longer a convention. It is a guarantee.
The pattern
Three statements set it up on a tenant-scoped table.
-- 1. Turn on row security for the table.
ALTER TABLE app_data ENABLE ROW LEVEL SECURITY;
-- 2. A row is visible only when its tenant_id matches the session's tenant.
CREATE POLICY tenant_isolation ON app_data
USING (tenant_id = current_setting('app.current_tenant')::uuid);
-- 3. Force the table OWNER to obey the policy too (see footgun #1).
ALTER TABLE app_data FORCE ROW LEVEL SECURITY;
Then, on every request, your application tells Postgres which tenant it is acting for, inside the transaction:
BEGIN;
SET LOCAL app.current_tenant = '6f9d...'; -- the current tenant's UUID
-- ... your normal queries, with NO tenant_id filter needed ...
COMMIT;
That is the core of it. Your application code stops writing WHERE tenant_id
everywhere; it just sets the tenant context once per request and runs ordinary
queries. The USING clause controls which rows are visible to reads; it
implicitly also controls writes unless you add a separate WITH CHECK, so a
tenant cannot insert or update rows into another tenant’s id.
Footgun 1: owners and superusers bypass RLS
This is the one that gives people false confidence. Straight from the Postgres
documentation: superusers and roles with the BYPASSRLS attribute always
bypass row security, and table owners normally bypass it too.
So if your application connects as the role that owns the tables (very common in small setups) or as a superuser, your carefully written policy does nothing. Every query sees every tenant. Two things fix it, and you want both:
-- Make even the owner obey the policy.
ALTER TABLE app_data FORCE ROW LEVEL SECURITY;
-- Connect the application as a dedicated role that owns nothing
-- and is not a superuser.
CREATE ROLE app_runtime LOGIN PASSWORD '...';
GRANT SELECT, INSERT, UPDATE, DELETE ON app_data TO app_runtime;
This is exactly what AWS recommends in its own multi-tenant RLS guidance: your
application should connect as a user other than the owner of the database
objects. Run migrations as the owner, run the app as app_runtime.
Footgun 2: SET vs SET LOCAL with a connection pooler
This one only shows up in production, under concurrency, which makes it
dangerous. If you use PgBouncer in transaction mode (most production setups do),
a single backend connection is reused across many clients. If you set the tenant
with a plain SET:
SET app.current_tenant = '...'; -- WRONG under a transaction-mode pooler
that value persists on the connection after your request finishes. The next
client handed that pooled connection inherits your tenant’s context and gets the
wrong tenant’s rows. Use SET LOCAL inside an explicit transaction instead:
BEGIN;
SET LOCAL app.current_tenant = '...'; -- scoped to THIS transaction only
-- queries
COMMIT; -- context is gone, nothing leaks
SET LOCAL (or set_config('app.current_tenant', '...', true)) scopes the
variable to the current transaction, so pool reuse can never carry it forward.
One more pooler note: statement-mode pooling breaks session variables entirely,
so run session mode, or transaction mode with explicit transactions around your
tenant-scoped work.
Footgun 3: the missing composite index
RLS works by adding its policy condition to every query, effectively an implicit
WHERE tenant_id = .... If your indexes do not lead with tenant_id, Postgres
cannot use them to satisfy that condition efficiently and falls back to scanning
far more rows than it should. On a large pooled table this is the difference
between a fast query and a painfully slow one.
The rule: every index on a tenant table leads with tenant_id.
-- Not just an index on the lookup column:
CREATE INDEX ON app_data (tenant_id, created_at);
CREATE INDEX ON app_data (tenant_id, status);
It is cheap to get right when you design the tables and tedious to retrofit once you have data, so do it from the start.
Footgun 4: the unset variable
If a request reaches the database without setting app.current_tenant,
current_setting('app.current_tenant') raises an error and the query fails.
That is arguably the safe failure (no tenant context means no data, rather than
all data), but it surprises people. If you want a query to simply return nothing
instead of erroring when the context is missing, use the missing_ok form:
CREATE POLICY tenant_isolation ON app_data
USING (tenant_id = current_setting('app.current_tenant', true)::uuid);
The true second argument makes a missing setting return NULL instead of
throwing, so the comparison just matches no rows. Choose deliberately: fail
closed with an error, or fail closed with empty results. Never let a missing
tenant context fall through to all rows.
Two ways to carry the tenant identity
There are two common ways to tell Postgres which tenant a session is for:
- A runtime parameter (
app.current_tenant), set per transaction. One application role, the tenant passed as a session variable. This is what the pattern above uses and what AWS recommends, because there is nothing to provision per tenant. - A role per tenant, where the policy keys off
current_user. Strong, but you now manage a Postgres role for every tenant, which does not scale well past a modest number of tenants.
For most pooled SaaS, the runtime parameter is the right call. Reach for per-tenant roles only when you have a specific reason to want database-level role separation.
Summary
Row-Level Security turns the pool model’s biggest weakness, isolation that
depends on every query remembering a filter, into a guarantee the database
enforces. The pattern is small: enable RLS, write a policy against a session
variable, force it on the owner, and connect as a dedicated non-owner role. The
failure modes are specific and avoidable: owners and superusers bypass policies,
plain SET leaks across a connection pooler, missing tenant_id-leading
indexes wreck performance, and an unset tenant variable must fail closed. Handle
those four and you have pooled multi-tenancy you can defend to an auditor.
RLS is one layer. For the full picture of where it fits, see multi-tenant architecture for vertical SaaS and the single-tenant vs multi-tenant decision. For query performance once RLS is in place, the same index discipline runs through database indexing.
Rolling out RLS on a pooled database and want a second pair of eyes before it ships? I do this as part of Scale Readiness. Book a free 30-minute call and we will pressure -test your isolation before a customer finds the gap.