* feat(skills): add saas-multi-tenant — multi-tenant SaaS architecture with RLS, tenant-scoped queries, and isolation patterns * chore: sync generated registry files * fix: address code review — set_config, cleanup, args guard, findUnique * chore(pr411): Remove derived artifacts * chore(pr411): Drop derived files from diff * chore(pr411): Revert count drift --------- Co-authored-by: sickn33 <sickn33@users.noreply.github.com>
| name | description | risk | source | date_added | tags | tools |
|---|---|---|---|---|---|---|
| saas-multi-tenant | Design and implement multi-tenant SaaS architectures with row-level security, tenant-scoped queries, shared-schema isolation, and safe cross-tenant admin patterns in PostgreSQL and TypeScript. | safe | community | 2026-03-28 | | |
SaaS Multi-Tenant Architecture
When to Use This Skill
- The user is building a SaaS application where multiple customers share the same database
- The user asks about tenant isolation, row-level security, or data leakage prevention
- The user needs to scope every database query to a specific tenant without manual WHERE clauses
- The user asks about shared-schema vs schema-per-tenant vs database-per-tenant tradeoffs
- The user is implementing admin endpoints that must access data across tenants
- The user needs to add `tenant_id` columns to an existing single-tenant application
- The user asks about PostgreSQL RLS policies for tenant isolation
- The user is building tenant-aware middleware in Express, Fastify, or Next.js API routes
Do NOT use this skill when:
- The user is building a single-user application with no shared infrastructure
- The user asks about authentication only without tenant scoping (use an auth skill instead)
- The user needs general database schema design without multi-tenancy requirements
Core Workflow
- Determine the tenancy model. Ask the user about their scale expectations and isolation requirements. For most SaaS apps under 1000 tenants, shared-schema with a `tenant_id` column on every table is the correct default. Schema-per-tenant adds operational overhead (migrations run N times). Database-per-tenant is only justified when tenants have regulatory data residency requirements.
- Add `tenant_id` to every tenant-scoped table. The column must be `NOT NULL`, type `UUID` or `TEXT`, and included in every composite index. Never allow a tenant-scoped table to exist without this column: a missing `tenant_id` is a data leak waiting to happen.
- Set up PostgreSQL Row-Level Security (RLS). Create a policy on each tenant-scoped table that filters rows by `current_setting('app.current_tenant_id')`. This acts as a database-level safety net: even if application code forgets a WHERE clause, RLS blocks cross-tenant reads.
- Build tenant-aware middleware. At the start of every request, extract the `tenant_id` from the authenticated session or JWT claims. Set it on the database connection using `SET LOCAL app.current_tenant_id = '...'` inside a transaction (or the equivalent `set_config('app.current_tenant_id', $1, true)` when the value is passed as a bind parameter, since `SET LOCAL` does not accept placeholders). Every subsequent query in that transaction inherits the tenant scope automatically.
- Scope all ORM queries by tenant. If using Prisma, apply a global middleware that injects `where: { tenantId }` into every `findMany`, `findFirst`, `update`, and `delete` call. If using Drizzle, create a base query builder that includes the tenant filter. Never rely on developers remembering to add the filter manually.
- Handle tenant-aware migrations. Every new table migration must include `tenant_id` as a column. Write a linting rule or CI check that rejects any migration creating a table without `tenant_id` unless the table is explicitly marked as global (e.g., `plans`, `feature_flags`).
- Build cross-tenant admin routes separately. Admin endpoints that aggregate data across tenants must bypass RLS explicitly using `SET LOCAL role = 'admin_bypass'` or a dedicated database role. These routes must be protected by a separate admin authentication flow. Never reuse tenant user sessions for admin access.
- Implement tenant provisioning. When a new customer signs up, create their tenant record, seed default data (roles, settings, onboarding state), and assign the founding user. Wrap this in a database transaction so partial provisioning never leaves orphan records.
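The migration check described in the workflow could be sketched roughly as follows. This is a minimal illustration, not a definitive linter: it assumes migrations are plain SQL strings, uses a regular expression rather than a real SQL parser, and the names `GLOBAL_MIGRATION_TABLES` and `findUnscopedTables` are hypothetical.

```typescript
// Hypothetical allow-list of global tables that intentionally have no tenant_id.
const GLOBAL_MIGRATION_TABLES = new Set(["plans", "feature_flags"]);

// Returns the names of tables created in `sql` that lack a tenant_id column
// and are not on the global allow-list. Regex-based sketch, not a SQL parser.
function findUnscopedTables(sql: string): string[] {
  const offenders: string[] = [];
  // Matches: CREATE TABLE [IF NOT EXISTS] name ( ...body... );
  const createTable =
    /create\s+table\s+(?:if\s+not\s+exists\s+)?"?(\w+)"?\s*\(([\s\S]*?)\)\s*;/gi;
  let match: RegExpExecArray | null;
  while ((match = createTable.exec(sql)) !== null) {
    const table = match[1] ?? "";
    const body = match[2] ?? "";
    const hasTenantId = /\btenant_id\b/i.test(body);
    if (!hasTenantId && !GLOBAL_MIGRATION_TABLES.has(table.toLowerCase())) {
      offenders.push(table);
    }
  }
  return offenders;
}
```

A CI step would read each new migration file, pass its contents to `findUnscopedTables`, and fail the build when the returned list is non-empty.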
Examples
Example 1: PostgreSQL RLS Policy for Tenant Isolation
    -- Enable RLS on the table
    ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
    ALTER TABLE projects FORCE ROW LEVEL SECURITY;

    -- Policy: users can only see rows where tenant_id matches the session variable
    CREATE POLICY tenant_isolation ON projects
      USING (tenant_id = current_setting('app.current_tenant_id')::uuid);

    -- Policy for INSERT: new rows must match the current tenant
    CREATE POLICY tenant_insert ON projects
      FOR INSERT
      WITH CHECK (tenant_id = current_setting('app.current_tenant_id')::uuid);
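Because the same statements have to be repeated for every tenant-scoped table, it may help to generate them. The sketch below is one possible approach, assuming a UUID column named `tenant_id`; the function name `rlsStatementsFor` is hypothetical. It folds the read and insert rules into a single policy with both `USING` and `WITH CHECK`, which is a simplification of the two-policy form shown above.

```typescript
// Builds RLS statements for one tenant-scoped table.
// Assumes the tenant column is a UUID named tenant_id.
function rlsStatementsFor(table: string): string[] {
  // Identifiers cannot be bound as query parameters, so accept only
  // plain snake_case names before interpolating them into SQL.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`Unsafe table name: ${table}`);
  }
  const tenantMatch =
    "tenant_id = current_setting('app.current_tenant_id')::uuid";
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY;`,
    `ALTER TABLE ${table} FORCE ROW LEVEL SECURITY;`,
    // One policy covering reads (USING) and writes (WITH CHECK).
    `CREATE POLICY tenant_isolation ON ${table} USING (${tenantMatch}) WITH CHECK (${tenantMatch});`,
  ];
}
```

The output can be pasted into a migration or executed by a migration script for each table in the tenant-scoped list.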
Example 2: Express Middleware That Sets Tenant Context per Request
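    import { Pool, PoolClient } from "pg";
    import type { Request, Response, NextFunction } from "express";

    const pool = new Pool({ connectionString: process.env.DATABASE_URL });

    // Request shape after auth: req.auth is assumed to be populated from the JWT.
    type TenantRequest = Request & {
      auth?: { tenantId?: string };
      db?: PoolClient;
      tenantId?: string;
    };

    async function tenantMiddleware(req: TenantRequest, res: Response, next: NextFunction) {
      const tenantId = req.auth?.tenantId; // extracted from JWT during auth
      if (!tenantId) return res.status(403).json({ error: "No tenant context" });

      const client = await pool.connect();
      try {
        await client.query("BEGIN");
        // Use set_config: SET LOCAL does not accept bind placeholders ($1)
        await client.query("SELECT set_config('app.current_tenant_id', $1, true)", [tenantId]);
        req.db = client;
        req.tenantId = tenantId;

        // Clean up on "close", which fires after a normal finish and also when
        // the client disconnects early, so the connection is always released.
        res.once("close", async () => {
          try {
            // Commit only if the response was fully written; otherwise roll back.
            await client.query(res.writableFinished ? "COMMIT" : "ROLLBACK");
          } catch {
            await client.query("ROLLBACK").catch(() => {});
          } finally {
            client.release();
          }
        });
        next();
      } catch (err) {
        await client.query("ROLLBACK").catch(() => {});
        client.release();
        next(err);
      }
    }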
Example 3: Prisma Middleware for Automatic Tenant Scoping
    import { PrismaClient } from "@prisma/client";

    // Tables that do NOT have tenant_id (global tables)
    const GLOBAL_TABLES = new Set(["Plan", "FeatureFlag", "SystemConfig"]);

    const READ_ACTIONS = ["findMany", "findFirst", "count", "aggregate"];
    const WRITE_ACTIONS = ["update", "updateMany", "delete", "deleteMany"];

    // Note: $use middleware is deprecated in recent Prisma releases in favor of
    // client extensions ($extends); this sketch targets versions that still ship it.
    // Each PrismaClient opens its own connection pool, so cache or reuse the
    // returned client rather than calling this on every request.
    function createTenantPrisma(tenantId: string): PrismaClient {
      const prisma = new PrismaClient();

      prisma.$use(async (params, next) => {
        if (GLOBAL_TABLES.has(params.model ?? "")) return next(params);

        // Prisma passes undefined args for calls like findMany()
        params.args = params.args ?? {};

        // Inject the tenant filter on reads and scope updates and deletes.
        // Only these actions get a `where`; adding one to create() would be rejected.
        // findUnique is skipped (it only accepts unique-field selectors), so callers
        // must check tenantId on the result or rely on RLS.
        // update/delete take a unique selector; adding tenantId there assumes a
        // Prisma version that allows extra filters in unique where clauses.
        if (READ_ACTIONS.includes(params.action) || WRITE_ACTIONS.includes(params.action)) {
          params.args.where = { ...(params.args.where ?? {}), tenantId };
        }

        // Inject tenantId on creates; createMany accepts one object or an array
        if (params.action === "create") {
          params.args.data = { ...(params.args.data ?? {}), tenantId };
        } else if (params.action === "createMany") {
          const rows = Array.isArray(params.args.data) ? params.args.data : [params.args.data];
          params.args.data = rows.map((d: any) => ({ ...d, tenantId }));
        }

        return next(params);
      });

      return prisma;
    }
Never Do This
- Never query a tenant-scoped table without a `tenant_id` filter. Even if your ORM middleware handles it, raw SQL queries bypass middleware entirely. Every raw query must include `WHERE tenant_id = $1` or rely on RLS. A single unscoped `SELECT * FROM invoices` leaks every customer's billing data.
- Never store `tenant_id` only in the application session without enforcing it at the database level. Application-layer filtering is a suggestion. RLS is enforcement. If a bug in your middleware skips the tenant filter, only RLS prevents the data leak. Run both layers.
- Never use auto-incrementing integer IDs for tenant-scoped resources. Sequential IDs (`invoice #1042`) let attackers enumerate other tenants' resources by incrementing the ID. Use UUIDs for all tenant-scoped primary keys. Reserve integer IDs for internal-only tables.
- Never let tenant users access admin aggregation endpoints. A route like `GET /admin/metrics` that queries across all tenants must never be reachable with a regular tenant JWT. Use a separate authentication mechanism (API key, admin role claim with a different issuer) for cross-tenant routes.
- Never run migrations with RLS enabled on the migration connection. The migration user needs to create tables, add columns, and modify policies. RLS does not filter DDL such as `ALTER TABLE`, but data migrations (backfills, `UPDATE`, `DELETE`) run under an RLS-restricted role silently touch only the rows visible to the "current tenant", or fail outright when the tenant setting is missing. Use a dedicated superuser or `BYPASSRLS` role for migrations.
- Never return a connection to a shared pool with tenant context still set. `SET LOCAL app.current_tenant_id` is scoped to the transaction and is discarded at `COMMIT` or `ROLLBACK`. The risk is a connection that goes back to the pool with its transaction still open, or a tenant that was set with a session-level `SET`: the next request then inherits stale tenant context. Always end the transaction in the cleanup path, and `RESET app.current_tenant_id` if it was ever set at session level.
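The rule about admin aggregation endpoints can be illustrated with a small claims check. This is a sketch under stated assumptions: the token signature has already been verified elsewhere, admin tokens come from a separate issuer, and the issuer URL, the `platform_admin` role name, and the `canAccessAdminRoute` function are all hypothetical.

```typescript
// Claims assumed to come from an already-verified JWT.
interface TokenClaims {
  iss: string;
  role?: string;
  tenantId?: string;
}

// Hypothetical issuer used only for platform-admin tokens.
const ADMIN_ISSUER = "https://admin-auth.example.com";

// Allows cross-tenant admin routes only for tokens minted by the admin
// issuer, carrying the admin role, and not bound to any tenant.
function canAccessAdminRoute(claims: TokenClaims): boolean {
  return (
    claims.iss === ADMIN_ISSUER &&
    claims.role === "platform_admin" &&
    claims.tenantId === undefined
  );
}
```

Rejecting tokens that carry a `tenantId` means a regular tenant JWT cannot pass the check even if it somehow carries an admin role claim.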
Edge Cases
- Tenant deletion and data retention. When a tenant cancels their subscription, you cannot simply `DELETE FROM tenants WHERE id = $1`. Foreign key cascades may time out on large datasets. Instead, soft-delete the tenant (set `deleted_at`), revoke all user sessions, then run a background job that deletes tenant data in batches over hours or days.
- Tenant data export for GDPR/compliance. When a tenant requests a full data export, you need to query every tenant-scoped table for that `tenant_id` and package it. Build a registry of all tenant-scoped tables (parse your migration files or maintain a manifest) so the export job doesn't miss tables added after the export feature was built.
- Shared resources between tenants. Some features require shared state, e.g., a marketplace where Tenant A's products are visible to Tenant B's users. These tables need a different RLS policy: read access is public (no tenant filter), but write access is still scoped to the owning tenant. Model these as `owner_tenant_id` instead of `tenant_id`.
- Tenant-aware background jobs. When a cron job or queue worker processes tasks, there is no HTTP request to extract `tenant_id` from. The job payload must include `tenant_id`, and the worker must set the database session variable before processing. Never run background jobs without tenant context: they will either fail on RLS or bypass it entirely.
- Connection pool exhaustion with schema-per-tenant. If you use one PostgreSQL schema per tenant and each schema requires its own connection pool, 500 tenants means 500 pools. This exhausts `max_connections` fast. Use a connection pooler like PgBouncer in transaction mode, or switch to shared-schema before hitting this wall.
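For the background-job case, a worker needs to set tenant context without any HTTP request. One way to sketch this is a helper that wraps a unit of work in a transaction with the tenant variable set. The `QueryClient` and `ClientPool` interfaces below are assumptions that mirror only the `connect`, `query`, and `release` calls used in Example 2, and `withTenant` is a hypothetical name.

```typescript
// Minimal client and pool shapes this sketch relies on (assumed, not pg's full types).
interface QueryClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
  release(): void;
}
interface ClientPool {
  connect(): Promise<QueryClient>;
}

// Runs `work` inside a transaction with app.current_tenant_id set, then
// commits on success or rolls back on error. The client is always released.
async function withTenant<T>(
  pool: ClientPool,
  tenantId: string,
  work: (client: QueryClient) => Promise<T>,
): Promise<T> {
  if (!tenantId) throw new Error("Job is missing tenantId");
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    // Transaction-local setting: discarded at COMMIT or ROLLBACK.
    await client.query("SELECT set_config('app.current_tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK").catch(() => undefined);
    throw err;
  } finally {
    client.release();
  }
}
```

A queue worker would call something like `withTenant(pool, job.tenantId, (db) => processJob(db, job))`, where `job` and `processJob` stand in for the application's own payload and handler.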
Best Practices
- Create a `tenants` table as the single source of truth. Every `tenant_id` foreign key in every table points back to `tenants.id`. Include columns for `name`, `slug` (for subdomain routing), `plan_id`, `created_at`, and `deleted_at`. This table is the root of your entire data model.
- Index `tenant_id` as the first column in every composite index. PostgreSQL B-tree indexes are most effective when the query constrains the leading columns. An index on `(tenant_id, created_at)` serves both "all items for tenant X" and "items for tenant X sorted by date." An index on `(created_at, tenant_id)` only helps date-range queries across all tenants.
- Use subdomains or path prefixes for tenant routing. `acme.yourapp.com` or `yourapp.com/org/acme`: both work. Map the subdomain or path to a `tenant_id` lookup at the edge (middleware or reverse proxy). This lookup should be cached (Redis or in-memory with 60s TTL) since it runs on every single request.
- Separate tenant-scoped tables from global tables explicitly. Maintain a list (code constant or database table) of which tables are global (no `tenant_id`) and which are tenant-scoped. Use this list in your ORM middleware, your migration linter, and your data export job. If a table isn't in either list, the CI check should fail.
- Test with at least 3 tenants in your seed data. A single tenant in development hides every multi-tenancy bug. Two tenants hide bugs where the first tenant's data leaks to the second but not vice versa. Three tenants catch ordering and filtering bugs that only appear with multiple peers.
- Rate-limit and quota per tenant, not globally. A global rate limit of 1000 requests/minute means one noisy tenant can exhaust the quota for everyone. Implement per-tenant rate limiting using a Redis key pattern like `ratelimit:{tenant_id}:{endpoint}` with a sliding window counter.
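The per-tenant sliding window counter can be sketched in memory to show the arithmetic. This is an illustration only: a `Map` stands in for Redis, the caller supplies the current time, and `createTenantRateLimiter` is a hypothetical name. The estimate weights the previous window's count by how much of it still overlaps the sliding window.

```typescript
interface WindowState {
  windowStart: number;
  current: number;
  previous: number;
}

// Returns an `allow` function that counts requests per tenant and endpoint.
function createTenantRateLimiter(limit: number, windowMs: number) {
  const state = new Map<string, WindowState>(); // in-memory stand-in for Redis

  return function allow(tenantId: string, endpoint: string, nowMs: number): boolean {
    const key = `ratelimit:${tenantId}:${endpoint}`;
    const windowStart = Math.floor(nowMs / windowMs) * windowMs;
    let s = state.get(key);
    if (!s) {
      s = { windowStart, current: 0, previous: 0 };
      state.set(key, s);
    } else if (s.windowStart !== windowStart) {
      // The old count carries over only from the immediately preceding window.
      s.previous = windowStart - s.windowStart === windowMs ? s.current : 0;
      s.current = 0;
      s.windowStart = windowStart;
    }
    // Weight the previous window by the share still inside the sliding window.
    const elapsedFraction = (nowMs - windowStart) / windowMs;
    const estimated = s.previous * (1 - elapsedFraction) + s.current;
    if (estimated >= limit) return false;
    s.current += 1;
    return true;
  };
}
```

Because the key includes the tenant, one tenant hitting its limit leaves every other tenant's budget untouched. A Redis-backed version would keep the same two counters per key with expirations instead of the `Map`.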