Multi-Tenant SaaS Architecture Reference

Abstract

This paper documents a reference architecture for production multi-tenant B2B SaaS, covering tenant isolation, identity, billing, observability, and compliance patterns. It is intended as a starting point for greenfield projects and a checklist for existing ones.

The reference stack

Application: TypeScript end-to-end (Next.js or Astro frontend, Node or Bun backend).
Database: Postgres with Row-Level Security.
Cache: Redis or Cloudflare KV.
Background jobs: Postgres-backed queue (graphile-worker; River if the workers are written in Go) for moderate scale.
Identity: Better Auth, Lucia, or a hosted provider (Clerk, WorkOS, Auth0) for SSO-heavy products.
Hosting: Cloudflare or AWS.
Observability: OpenTelemetry → Grafana Tempo / Honeycomb / Datadog.
Billing: Stripe + custom metering layer.

Tenant isolation

Shared schema with Row-Level Security is the default. Every table has a tenant_id column, and RLS policies enforce isolation at the database level. Each request sets the tenant context on its pooled connection before running any query. Full details in our guide.
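The per-request tenant context can be sketched as follows. This is a minimal illustration under stated assumptions, not the guide's implementation: the `projects` table, the `uuid` type of tenant_id, the `app.tenant_id` setting name, and the `Queryable` interface (a stand-in for a real driver client) are all hypothetical. Passing `true` as the third argument of Postgres's `set_config` scopes the value to the current transaction, so a pooled connection should not carry one tenant's context into the next request.

```typescript
// Minimal stand-in for a client checked out from a connection pool.
// Hypothetical interface; a real driver client would be adapted to it.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Example policy for a hypothetical `projects` table, applied once in a migration.
// current_setting() raises an error if app.tenant_id was never set, so a request
// that forgets to set the context fails instead of seeing every tenant's rows.
const TENANT_POLICY_SQL = `
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects
  USING (tenant_id = current_setting('app.tenant_id')::uuid);
`;

// Runs `work` inside a transaction whose tenant context is set first.
async function withTenant<T>(
  client: Queryable,
  tenantId: string,
  work: (client: Queryable) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    // Third argument `true` makes the setting local to this transaction.
    await client.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

A request handler would wrap all of its queries in `withTenant`, taking the tenant ID from the authenticated session rather than from user input.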

Reserve dedicated-database isolation for enterprise customers who require it (and pay accordingly).

Identity

Users belong to tenants. A user can belong to multiple tenants (consultants, agencies).
Authentication produces a session token containing user ID and active tenant ID.
API authorisation checks both user permissions and tenant membership on every request.
Enterprise buyers require SSO/SAML support. Build it in early or budget for a painful retrofit.
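The two-part authorisation check described above (tenant membership first, then permission) might look like the sketch below. The `Session`, `Membership`, and `Permission` shapes are assumptions made for illustration; a real system would load memberships from the database.

```typescript
// Hypothetical permission set for the example.
type Permission = "read" | "write" | "admin";

// What the session token is assumed to carry after authentication.
interface Session {
  userId: string;
  activeTenantId: string;
}

// One row per (user, tenant) pair; a user may have several.
interface Membership {
  userId: string;
  tenantId: string;
  permissions: Permission[];
}

type AuthzResult =
  | { allowed: true }
  | { allowed: false; reason: "not_a_member" | "missing_permission" };

// Checks tenant membership before permissions, so a valid session for one
// tenant can never act on another tenant's resources.
function authorize(
  session: Session,
  memberships: Membership[],
  required: Permission,
): AuthzResult {
  const membership = memberships.find(
    (m) => m.userId === session.userId && m.tenantId === session.activeTenantId,
  );
  if (!membership) {
    return { allowed: false, reason: "not_a_member" };
  }
  if (membership.permissions.indexOf(required) === -1) {
    return { allowed: false, reason: "missing_permission" };
  }
  return { allowed: true };
}
```

Running this check on every request, instead of trusting the tenant ID in the token, means a removed member loses access as soon as the membership row is deleted.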

Billing

Stripe handles payment collection. Don't roll your own.
Subscription state is mirrored in your own database for fast access.
Usage-based metering is your responsibility. Stripe's metered billing API helps but doesn't replace the metering layer.
Reconcile your records to Stripe nightly. Disputes are inevitable; have data ready.
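A nightly reconciliation pass can be as simple as diffing the mirrored subscription rows against what Stripe reports. The sketch below assumes both sides have already been loaded into a common `SubscriptionSnapshot` shape; that type, its field names, and the `Discrepancy` kinds are hypothetical, and the step that fetches data from Stripe is deliberately left out.

```typescript
// Common shape for a subscription as seen locally and as reported by Stripe.
// Field names are illustrative, not Stripe's API schema.
interface SubscriptionSnapshot {
  subscriptionId: string;
  status: string;
  priceId: string;
}

interface Discrepancy {
  subscriptionId: string;
  kind: "missing_locally" | "missing_in_stripe" | "mismatch";
}

// Compares the local mirror with the remote records and reports differences.
function reconcile(
  local: SubscriptionSnapshot[],
  remote: SubscriptionSnapshot[],
): Discrepancy[] {
  const localById = new Map<string, SubscriptionSnapshot>();
  local.forEach((s) => localById.set(s.subscriptionId, s));

  const remoteIds = new Set<string>();
  const discrepancies: Discrepancy[] = [];

  remote.forEach((r) => {
    remoteIds.add(r.subscriptionId);
    const l = localById.get(r.subscriptionId);
    if (!l) {
      discrepancies.push({ subscriptionId: r.subscriptionId, kind: "missing_locally" });
    } else if (l.status !== r.status || l.priceId !== r.priceId) {
      discrepancies.push({ subscriptionId: r.subscriptionId, kind: "mismatch" });
    }
  });

  // Anything the mirror has that Stripe does not is also worth a look.
  local.forEach((l) => {
    if (!remoteIds.has(l.subscriptionId)) {
      discrepancies.push({ subscriptionId: l.subscriptionId, kind: "missing_in_stripe" });
    }
  });

  return discrepancies;
}
```

Storing each night's discrepancy list gives a history to draw on when a dispute arrives.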

Audit logging

Every authenticated mutation is logged with: user, tenant, action, target, before/after state, timestamp.
Logs are stored separately from operational data — different retention, different indexes.
Sensitive actions (auth changes, data exports, deletions) get extra scrutiny, such as real-time alerting.
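One possible shape for an audit record carrying the fields listed above is sketched here. The type names, the `sensitive` flag, and the action identifiers in `SENSITIVE_ACTIONS` are assumptions for illustration; the source names only the categories of sensitive action.

```typescript
// Hypothetical action identifiers for the sensitive categories.
const SENSITIVE_ACTIONS: string[] = ["auth.change", "data.export", "data.delete"];

interface AuditInput<T> {
  userId: string;
  tenantId: string;
  action: string;
  target: string;     // for example "project:123"
  before: T | null;   // null when the mutation creates the target
  after: T | null;    // null when the mutation deletes the target
}

interface AuditEntry<T> extends AuditInput<T> {
  timestamp: string;  // ISO 8601 in UTC
  sensitive: boolean; // lets a downstream consumer alert on these entries
}

// Builds the record; `now` is injectable so tests can use a fixed clock.
function buildAuditEntry<T>(input: AuditInput<T>, now: Date = new Date()): AuditEntry<T> {
  return {
    ...input,
    timestamp: now.toISOString(),
    sensitive: SENSITIVE_ACTIONS.indexOf(input.action) !== -1,
  };
}
```

Writing the entry in the same transaction as the mutation would keep the two from drifting apart, at the cost of coupling the audit store to the operational database; that trade-off is a design decision this sketch does not settle.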

Observability

Every log line, metric, and trace span is tagged with tenant ID.
Trace requests across services. A browser-to-database trace is the minimum.
Per-tenant dashboards for support triage. "Tenant X is reporting issues" needs to be debuggable in seconds.
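Tenant tagging is easiest to enforce when the logger itself is bound to a tenant, so individual call sites cannot forget the field. The sketch below is one way to do that; the `Sink` type and the `tenant_id` field name are assumptions, and the same value would be attached to metrics and trace spans as an attribute.

```typescript
type Fields = { [key: string]: string | number | boolean };

interface LogRecord {
  message: string;
  fields: Fields;
}

// Where records end up (stdout, a collector, a test array). Hypothetical type.
type Sink = (record: LogRecord) => void;

// Returns a logger that stamps every record with the tenant ID.
function tenantLogger(tenantId: string, sink: Sink) {
  return {
    info(message: string, fields: Fields = {}): void {
      // tenant_id is spread last so a caller-supplied field cannot overwrite it.
      sink({ message, fields: { ...fields, tenant_id: tenantId } });
    },
  };
}
```

A request middleware would create one such logger per request from the session's active tenant and pass it down to handlers.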

Compliance scaffolding

Encryption at rest (database, object storage) by default.
Encryption in transit (TLS) by default.
Secret management via a managed service (AWS Secrets Manager, GCP Secret Manager, 1Password Service Accounts).
SOC 2 evidence collection is automated where possible (Vanta, Drata, Secureframe).
Data deletion requests have a documented process. GDPR / CCPA timelines are real.

Background processing

Workers route jobs to the correct tenant context.
Job idempotency is mandatory. Workers will retry.
Per-tenant job rate limits prevent one customer's workload from starving others.
Failed jobs go to a dead-letter queue with alerting.
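The retry, idempotency, and dead-letter rules above can be illustrated with a small in-memory runner. This is a sketch under assumptions: the `Job` shape, the `Outcome` names, and `createJobRunner` are hypothetical, and a production system would record completed keys in a table with a unique constraint instead of a `Set`.

```typescript
interface Job {
  id: string;
  tenantId: string;
  idempotencyKey: string;
  attempts: number;
}

type Outcome = "done" | "skipped_duplicate" | "retry" | "dead_letter";

function createJobRunner(maxAttempts: number) {
  // Keys of jobs whose side effects have already been applied.
  const completed = new Set<string>();
  // Jobs that exhausted their attempts; alerting would watch this.
  const deadLetters: Job[] = [];

  function run(job: Job, handler: (job: Job) => void): Outcome {
    // The key is scoped by tenant so two tenants can reuse the same key.
    const key = job.tenantId + ":" + job.idempotencyKey;
    if (completed.has(key)) {
      return "skipped_duplicate";
    }
    try {
      handler(job);
      completed.add(key);
      return "done";
    } catch (_err) {
      job.attempts += 1;
      if (job.attempts >= maxAttempts) {
        deadLetters.push(job);
        return "dead_letter";
      }
      return "retry";
    }
  }

  return { run, deadLetters };
}
```

The duplicate check is what makes a retry after a lost acknowledgement harmless: the handler runs at most once per tenant-scoped key.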

Caching

Cache keys always include tenant ID. Always.
Cache invalidation strategies depend on the data; prefer stale-while-revalidate for analytics and time-to-live for everything else.
Don't cache things that change per-user unless the cost of staleness is well understood.
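Routing every cache access through a single key builder makes the tenant prefix hard to omit. The helper below is a minimal sketch; the `tenant:` prefix and the function name are assumptions, not a convention taken from Redis or Cloudflare KV.

```typescript
// Builds a cache key that always starts with the tenant ID.
// Segments are URI-encoded so a ":" inside a value cannot be mistaken
// for a separator and collide with another tenant's key.
function tenantCacheKey(tenantId: string, namespace: string, ...parts: string[]): string {
  if (tenantId.length === 0) {
    throw new Error("tenantId is required for every cache key");
  }
  const segments = [tenantId, namespace].concat(parts).map((s) => encodeURIComponent(s));
  return "tenant:" + segments.join(":");
}
```

For example, `tenantCacheKey("t1", "report", "2024")` yields `tenant:t1:report:2024`.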

API design

REST for most resources. tRPC if you're shipping a TypeScript-only stack.
Idempotency keys on every mutation endpoint.
Cursor-based pagination, not offset.
Strict versioning. API changes break customer integrations.
Rate limiting per tenant + per API key.
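Cursor-based pagination resumes from the last row seen instead of skipping a counted offset, so pages stay stable while rows are inserted. The sketch below runs over an in-memory array as a stand-in for a query of the form `WHERE (created_at, id) > ($1, $2) ORDER BY created_at, id LIMIT $3`; the `Item` shape and the plain-text cursor format are assumptions, and a real API would normally make the cursor opaque.

```typescript
interface Item {
  id: string;
  createdAt: number; // epoch milliseconds
}

interface Page {
  items: Item[];
  nextCursor: string | null;
}

// The cursor records the sort position of the last item returned.
function encodeCursor(item: Item): string {
  return item.createdAt + ":" + item.id;
}

function decodeCursor(cursor: string): { createdAt: number; id: string } {
  const sep = cursor.indexOf(":");
  const createdAt = Number(cursor.slice(0, sep));
  if (sep < 1 || isNaN(createdAt)) {
    throw new Error("invalid cursor");
  }
  return { createdAt, id: cursor.slice(sep + 1) };
}

function listPage(all: Item[], limit: number, cursor?: string): Page {
  if (limit < 1) {
    throw new Error("limit must be at least 1");
  }
  // Sort by (createdAt, id); id breaks ties so the order is total.
  const sorted = all.slice().sort(
    (a, b) => a.createdAt - b.createdAt || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0),
  );
  let remaining = sorted;
  if (cursor !== undefined) {
    const after = decodeCursor(cursor);
    remaining = sorted.filter(
      (i) => i.createdAt > after.createdAt || (i.createdAt === after.createdAt && i.id > after.id),
    );
  }
  const items = remaining.slice(0, limit);
  const hasMore = remaining.length > limit;
  return { items, nextCursor: hasMore ? encodeCursor(items[items.length - 1]) : null };
}
```

Including `id` in the cursor matters whenever two rows share a timestamp; without the tie-breaker, rows can be skipped or repeated at page boundaries.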

Onboarding flow

Self-serve sign-up creates a personal tenant by default.
Team invitations work over email with single-use tokens.
Enterprise customers get manual provisioning with SSO setup.
First-run UX is curated — don't dump new users into an empty product.
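Single-use invitation tokens reduce to two rules: a token is redeemable once, and only before it expires. The in-memory sketch below shows both. The `Invitation` shape and `createInvitationStore` are hypothetical, the token generator is injected (in production it should draw from a cryptographically secure source), and a real store would persist a hash of the token, not the token itself.

```typescript
interface Invitation {
  tenantId: string;
  email: string;
  expiresAt: number;     // epoch milliseconds
  usedAt: number | null; // set on the first successful redemption
}

function createInvitationStore(generateToken: () => string) {
  const invitations = new Map<string, Invitation>();

  return {
    // Creates an invitation and returns the token to embed in the email link.
    issue(tenantId: string, email: string, now: number, ttlMs: number): string {
      const token = generateToken();
      invitations.set(token, { tenantId, email, expiresAt: now + ttlMs, usedAt: null });
      return token;
    },

    // Returns the invitation on first use; null if unknown, used, or expired.
    redeem(token: string, now: number): Invitation | null {
      const invitation = invitations.get(token);
      if (!invitation || invitation.usedAt !== null || now > invitation.expiresAt) {
        return null;
      }
      invitation.usedAt = now;
      return invitation;
    },
  };
}
```

With a database behind it, the redeem step would be a single conditional update so that two concurrent clicks cannot both succeed.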

Recommendations

Pick boring, proven tools. Most "new" data infrastructure is unnecessary.
Build the seams for the migrations you'll need later. RLS-shared today, dedicated-database in three years.
Invest in observability before you need it. Adding it after the first crisis is too late.
Get billing right from day one. Bad billing is a constant tax on the company.

Conclusion

Multi-tenant SaaS architecture is well understood in 2026. The mistake usually isn't a wrong technology choice; it's a missing operational discipline (audit logging, observability, compliance) added years later under duress. Build the discipline in from the start, and the architecture takes care of itself.
