Kubernetes Multi-Tenancy: Beyond Namespaces
Six months ago, your team built a shared Kubernetes platform. The first few internal teams onboarded smoothly. Namespaces per team, RBAC configured, NetworkPolicy deployed. Everyone felt good about it. Then the payments team asked to onboard their PCI-scoped workloads. Your security team ran an isolation assessment and asked one question: “Can the marketing team’s pods reach the payments namespace over the network?” You checked. With your current Cilium configuration and the default-allow egress policy nobody thought to restrict, the answer was yes. The room went quiet.
That is the namespace isolation gap. And here is the part that really stings: it is ten times harder to fix with 12 tenants running production workloads than it would have been to design correctly before the first tenant landed. The industry has seen both approaches. The retrofit takes 3-4 weeks of careful policy rollout with extensive testing, late nights, and change management overhead. The upfront design takes 3-4 days.
The Three Isolation Models
Choosing a multi-tenancy model is not a technology decision. It is a risk decision. You are trading isolation strength against operational complexity and infrastructure cost. Get this wrong and you will spend months retrofitting. Get it right and the platform scales smoothly. Understanding what each model actually provides, and what it does not, prevents choosing based on familiarity rather than fit.
Namespace isolation is soft multi-tenancy. RBAC controls which tenants access which namespaces. NetworkPolicy controls inter-namespace traffic. ResourceQuota prevents one namespace from starving the cluster. Here is what namespaces do not give you: kernel-level isolation between workloads on the same node, protection against one namespace exhausting shared control plane resources (API server, etcd) at the expense of others, or isolation between tenants’ CRD installations, since CRDs are cluster-scoped and every tenant shares one set. That list is longer than most teams realize.
For internal teams with mutual trust, namespace isolation is often sufficient. For external customers or regulated workloads, it almost never is. A simple litmus test: if a tenant compromise in one namespace would trigger a breach notification to another tenant, namespace isolation is not enough. Full stop.
Cluster per tenant provides the strongest isolation available. Separate control planes, separate nodes, independent Kubernetes versions, no shared kernel. The cost is real: significant per-cluster baseline spend on EKS/GKE, plus the operational overhead of managing N clusters. At 5 tenants, this is expensive but manageable. At 50, it demands serious cloud-native infrastructure automation with Cluster API or Crossplane for lifecycle management, plus cross-cluster observability and multi-cluster RBAC federation. Without that automation, your platform team becomes a full-time cluster babysitting service.
vCluster hits the sweet spot for many platforms, and it is the model we reach for most often. Each tenant gets a virtual Kubernetes cluster with its own API server. Tenants can install CRDs, create cluster-scoped resources, and even run kubectl cluster-info without affecting other tenants. The host cluster’s nodes are still shared, so you do not get kernel isolation. But the control plane isolation is substantially stronger than namespaces, and the operational complexity is far lower than per-tenant clusters because host infrastructure is shared. Platforms running 30+ vClusters on a single host cluster typically see about 8 GB of total overhead for the virtual control planes. Those economics change the conversation.
The isolation model sorted, the next thing that breaks is the network.
The NetworkPolicy Gaps Nobody Warns You About
This is where teams get burned. It almost always happens after tenants are already onboarded, which is what makes it so painful to fix. NetworkPolicy has four specific enforcement gaps that the Kubernetes documentation does not make obvious.
Gap 1: Your CNI might not enforce policies at all. This is the one that makes security engineers lose sleep. NetworkPolicy is an API. Enforcement is a CNI feature. Flannel does not enforce NetworkPolicy. If you are running Flannel and you have NetworkPolicy objects in your cluster, those objects are being silently ignored. There are clusters where teams have had carefully crafted NetworkPolicy manifests deployed for months with zero enforcement. Nobody knew. Calico, Cilium, and Antrea enforce policies. Verify yours does before you trust it.
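One way to verify enforcement is a canary: apply a default-deny ingress policy to a throwaway pod and confirm traffic is actually blocked. A minimal sketch, with illustrative names:

```yaml
# Canary: deny all ingress to pods labeled app=cni-canary.
# If another pod can still reach the canary after this is applied,
# your CNI is silently ignoring NetworkPolicy objects.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cni-enforcement-canary
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: cni-canary
  policyTypes:
    - Ingress
  # No ingress rules listed: all inbound traffic to matching pods is denied.
```

Run a pod labeled app: cni-canary serving HTTP, then curl it from a second pod in the same namespace. A successful response means the policy is not being enforced.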
Gap 2: DNS resolution breaks without explicit allow rules. A default-deny ingress and egress policy blocks everything, including DNS lookups to kube-dns on port 53. Every namespace with a default-deny egress policy needs an explicit rule allowing UDP and TCP traffic to kube-dns in the kube-system namespace on port 53. Miss it and workloads suddenly cannot resolve any service names, which is why initial policy rollouts so often cause outages. This happens on nearly every first policy deployment. It is the mistake that catches every team eventually.
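A baseline DNS-egress allowance might look like the following, assuming a tenant namespace named tenant-a; the kubernetes.io/metadata.name label is set automatically on namespaces in Kubernetes 1.22+:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: tenant-a   # illustrative tenant namespace
spec:
  podSelector: {}        # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Deploy this alongside the default-deny policy, not after it, or the rollout itself triggers the outage.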
Gap 3: IMDS access is wide open. By default, any pod can reach the cloud provider’s Instance Metadata Service at 169.254.169.254. That IMDS endpoint serves the node’s IAM role credentials. Think about that for a moment. A compromised pod can escalate privileges by fetching the node’s credentials through a simple HTTP call. Block egress to 169.254.169.254 in every tenant network policy, or use the cloud provider’s IMDS v2 with hop limits. Do both if you are serious about security.
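NetworkPolicy is allow-list only, so “blocking” IMDS means carving it out of whatever broad egress a tenant is granted. A sketch for a tenant that genuinely needs internet egress (namespace name illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-internet-no-imds
  namespace: tenant-a   # illustrative tenant namespace
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            # Carve the metadata service out of the broad allowance
            # so pods cannot fetch the node's IAM credentials.
            except:
              - 169.254.169.254/32
```

For tenants that do not need broad egress, skip the wide allowance entirely and enumerate destinations; the except clause is only for the cases where you must open the internet anyway.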
Gap 4: Host-networked pods bypass policies entirely. DaemonSets running with hostNetwork: true are not subject to NetworkPolicy. If a tenant can deploy a host-networked pod, they bypass your entire network isolation layer. Restrict hostNetwork via admission policy (OPA Gatekeeper or Kyverno) so that only platform-managed workloads can use it.
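A Kyverno sketch of that restriction, with a hypothetical platform-system namespace exempted (Gatekeeper can express the same constraint in Rego):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-host-network
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: deny-host-network
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - platform-system   # hypothetical platform namespace
      validate:
        message: "hostNetwork is restricted to platform-managed workloads."
        pattern:
          spec:
            # =() anchor: if hostNetwork is set, it must be false;
            # pods that omit the field (the default) pass unchanged.
            =(hostNetwork): "false"
```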
RBAC Complexity at Scale
Network sorted. Now RBAC becomes the problem.
Namespace-per-tenant RBAC starts simple: a Role and RoleBinding per namespace, scoped to tenant users. At 5 tenants, you have maybe 20 RBAC objects. Manageable. At 50 tenants with diverse requirements, CI/CD service accounts, and cross-tenant read access for shared dashboards, that number exceeds 500 objects. The configuration complexity does not grow linearly. It grows closer to quadratically as cross-tenant policies multiply.
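The simple starting point looks something like this per tenant; the group name and resource list are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-developer
  namespace: tenant-a
rules:
  # RBAC rules are a cross-product of apiGroups x resources x verbs;
  # the extra combinations (e.g. batch/pods) simply never match.
  - apiGroups: ["", "apps", "batch"]
    resources: ["pods", "services", "configmaps", "deployments", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-developer-binding
  namespace: tenant-a
subjects:
  - kind: Group
    name: team-a-developers   # hypothetical IdP group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-developer
  apiGroup: rbac.authorization.k8s.io
```

Two objects per tenant is the easy part. The explosion comes from the service accounts, cross-tenant read grants, and one-off exceptions layered on top.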
The failure mode we see every single time: RBAC drift. Permissions added temporarily and never removed. ClusterRoleBindings granting broad access because namespace-scoped permissions were insufficient for a specific integration. A CI/CD service account that got cluster-admin during a migration because “we will tighten it after the migration” and nobody did. That last one is so common it should be a meme.
The answer at scale is admission controllers. There is no alternative. OPA Gatekeeper or Kyverno policies that validate RBAC resources on creation prevent overly broad bindings, enforce naming conventions, and require ownership labels. The policy runs automatically on every kubectl apply. This is the same principle as automated remediation in reliability engineering: when the system grows past what humans can review manually, enforcement must be automatic.
A practical starting policy set for RBAC governance: block any RoleBinding referencing cluster-admin outside the platform namespace, require a team label on every ServiceAccount, deny ClusterRoleBindings from tenant namespaces, and require all Roles to specify explicit resource names rather than wildcards.
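Two of those guardrails sketched as a Kyverno ClusterPolicy, again with a hypothetical platform-system namespace exempted:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: rbac-guardrails
spec:
  validationFailureAction: Enforce
  rules:
    # Block RoleBindings that reference cluster-admin outside the
    # platform namespace.
    - name: block-cluster-admin-rolebindings
      match:
        any:
          - resources:
              kinds:
                - RoleBinding
      exclude:
        any:
          - resources:
              namespaces:
                - platform-system   # hypothetical platform namespace
      validate:
        message: "RoleBindings may not reference cluster-admin here."
        pattern:
          roleRef:
            name: "!cluster-admin"
    # Require an ownership label on every ServiceAccount.
    - name: require-team-label
      match:
        any:
          - resources:
              kinds:
                - ServiceAccount
      validate:
        message: "Every ServiceAccount must carry a team label."
        pattern:
          metadata:
            labels:
              team: "?*"   # any non-empty value
```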
Resource Isolation: The Noisy Neighbor Problem
RBAC governs who can do what. Resource isolation governs how much they can consume. Both break at scale if you are not deliberate.
ResourceQuota sets hard limits per namespace: maximum CPU requests, maximum memory limits, maximum pod count, maximum storage claims. LimitRange sets defaults and maxima for individual containers, preventing pods without resource specs from consuming unbounded node resources. You need both. Not one. Both.
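A paired example, sized here with placeholder numbers that should come from real profiling:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant-a   # illustrative tenant namespace
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
    pods: "200"
    persistentvolumeclaims: "30"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-defaults
  namespace: tenant-a
spec:
  limits:
    - type: Container
      defaultRequest:     # applied when a container omits requests
        cpu: 100m
        memory: 128Mi
      default:            # applied when a container omits limits
        cpu: 500m
        memory: 512Mi
      max:                # hard ceiling per container
        cpu: "4"
        memory: 8Gi
```

The two interact: once a ResourceQuota covers CPU and memory, the API server rejects pods that omit resource specs, so the LimitRange defaults are what keep unannotated workloads deployable.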
Here is the practical challenge: getting the numbers right. Quotas set 50% above average usage still allow noisy neighbor spikes that degrade co-located tenants. Quotas set at the 99th percentile waste capacity you are paying for. There is no magic formula.
Our approach: profile actual consumption for 2-4 weeks, set initial quotas at the 95th percentile plus 20% headroom, then review quarterly. Start conservative and expand on request with documented justification. Tightening quotas retroactively after a noisy neighbor incident that degraded three other tenants is a much harder conversation than starting tight. Trust us on this one.
Node affinity and dedicated node pools add another isolation layer for tenants with specific hardware or compliance needs. Taint-and-toleration patterns ensure tenant workloads land only on designated nodes and that those nodes reject workloads from other tenants. As for the security implications of shared-kernel multi-tenancy: the fundamental question is whether your threat model requires physical compute isolation or whether strong logical controls are sufficient. Answer that question before you onboard the first tenant. Not after you discover the gap during an audit.
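The taint-and-toleration pattern, sketched with illustrative pool, namespace, and image names:

```yaml
# Taint the dedicated pool first, e.g.:
#   kubectl taint nodes -l pool=tenant-a dedicated=tenant-a:NoSchedule
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tenant-a-workload
  namespace: tenant-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tenant-a-workload
  template:
    metadata:
      labels:
        app: tenant-a-workload
    spec:
      nodeSelector:
        pool: tenant-a        # attract: land only on the dedicated pool
      tolerations:
        - key: dedicated      # permit: tolerate that pool's taint
          operator: Equal
          value: tenant-a
          effect: NoSchedule
      containers:
        - name: app
          image: registry.example.com/tenant-a/app:1.0   # illustrative
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
```

Note that both halves are required: the taint repels everyone else’s workloads, and the nodeSelector keeps this tenant off the shared pools. Either one alone leaves a gap.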
The platform engineering discipline for multi-tenant Kubernetes comes down to building the automation and policy guardrails before the tenant count makes manual management impossible. Every platform we have seen succeed started with strong defaults and loosened them selectively. Every platform we have seen struggle started permissive and tried to tighten under pressure. The pattern is consistent enough to be a law: start tight, loosen deliberately. The reverse direction is a project that never ends.