IAM: Least Privilege That Actually Holds

Metasphere Engineering 12 min read

A developer hits a permission error deploying a new Lambda function. The deadline is tomorrow. She adds "Effect": "Allow", "Action": "*", "Resource": "*" to the IAM policy because that’s the fastest thing that makes the error disappear. The feature ships. The policy stays in place for the next two years because nobody remembers it exists and no automated process flags it.

A master key cut because the right key wasn’t available. Left on the hook. For two years.

Multiply that across 50 engineers over three years. In cloud environments, the IAM posture is mostly historical accident. Attackers who compromise a single generous role pivot using legitimate APIs. Every action looks normal. CloudTrail shows API calls indistinguishable from developer activity, because the permissions granted are identical. The burglar has a master key. The cameras show someone walking calmly through the front door.

Key takeaways
  • Action: * policies stay in place for years because nobody remembers they exist and no automated process flags them. The fastest fix becomes the permanent architecture. The master key that was supposed to be temporary.
  • Least privilege is enforced by tooling, not policy. IAM Access Analyzer identifies unused permissions. Automated trimming reduces attack surface without developer friction.
  • Service account keys should have a maximum lifetime of 90 days. Keys older than that are almost certainly forgotten. Prefer OIDC federation over long-lived keys entirely.
  • Cross-account trust relationships from old migrations are privilege escalation paths. Audit quarterly. Remove what’s no longer needed. The spare key you gave the contractor last year. They still have it.
  • Break-glass access needs an automated audit trail, not just a wiki page with instructions. Emergency admin access without logging is indistinguishable from compromise. The emergency key that doesn’t set off the alarm.

The Wildcard Problem

Every IAM audit tells the same story. Dozens of policies with wildcard actions, wildcard resources, or both. Each one created under deadline pressure. Each one forgotten right after the deployment succeeded. Master keys everywhere. Nobody remembers who cut them or why. The total result is an attack surface that grows without anyone intending it to.

IAM Anti-Pattern | Risk | How Common | Fix
"Resource": "*" wildcard | Blast radius = entire account | Pervasive in audits | Scope to specific ARNs
"Action": "*" admin access | Full admin to any service | Common in service roles | Grant minimum required actions
Long-lived access keys | Never rotated, leaked in repos | Keys routinely outlive their creators | Dynamic credentials via IAM roles
Cross-account trust without conditions | Any role in trusted account can assume | Widespread in multi-account setups | Add sts:ExternalId condition
Unused permissions retained | Attack surface grows without limit | Most permissions go unused | Monthly access analyzer review
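The two wildcard anti-patterns above are mechanical enough to scan for. The following is a minimal sketch, assuming policies are available as parsed JSON documents; the function name `find_wildcards` and the choice to also flag service-wide actions such as `s3:*` are illustrative assumptions, not part of any AWS tool.

```python
import json


def _as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def find_wildcards(policy: dict) -> list[str]:
    """Return one finding per wildcard action or resource in Allow statements."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be a bare object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action")):
            # Assumption: service-wide wildcards like "s3:*" are flagged too.
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: wildcard action {action!r}")
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource")
    return findings


# The deadline-day policy from the opening of this article.
deadline_policy = json.loads(
    '{"Version": "2012-10-17", "Statement": '
    '[{"Effect": "Allow", "Action": "*", "Resource": "*"}]}'
)

for finding in find_wildcards(deadline_policy):
    print(finding)
```

Run against an export of every policy in an account, a check like this turns "dozens of forgotten wildcards" into a list someone can work through.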

IAM Access Analyzer surfaces actual usage: which API calls a role made over 90 days. The security audit that shows which doors each person actually opened. Remove what was never used. Teams trimming monthly see sharp drops in their permission surface within a few quarters.
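The trimming step itself is set arithmetic. Here is a minimal sketch, assuming the granted and observed action names have already been exported (for example from an Access Analyzer report); `right_size` and its inputs are hypothetical names, and the sketch assumes explicit action names rather than wildcards.

```python
def right_size(granted: set[str], observed: set[str], buffer: set[str] = frozenset()) -> dict:
    """Split granted actions into those to keep and those to remove.

    The buffer can only retain actions that were already granted;
    trimming never adds a permission the role did not have.
    """
    keep = (set(observed) | set(buffer)) & set(granted)
    remove = set(granted) - keep
    fraction = len(remove) / len(granted) if granted else 0.0
    return {"keep": sorted(keep), "remove": sorted(remove), "removed_fraction": fraction}


# Hypothetical role: six actions granted, two observed in the usage window.
granted = {
    "s3:GetObject", "s3:PutObject", "s3:ListBucket",
    "s3:DeleteObject", "s3:DeleteBucket", "s3:PutBucketPolicy",
}
observed = {"s3:GetObject", "s3:PutObject"}
buffer = {"s3:ListBucket"}  # related action kept as a safety margin

plan = right_size(granted, observed, buffer)
print("keep:  ", plan["keep"])
print("remove:", plan["remove"])
print(f"{plan['removed_fraction']:.0%} of permissions removed")
```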

[Figure: Privilege escalation, 3 hops to admin. dev-team-role (read + deploy) uses iam:PassRole to reach lambda-exec-role, which uses sts:AssumeRole to reach infra-service-role, which has an admin policy attached. PMapper found this path in 30 seconds. Manual review missed it for 2 years.]

[Figure: IAM right-sizing, remove what nobody uses. Access Analyzer collects 90 days of usage data. 40 of 60 permissions were never exercised. A new policy is generated with only the 20 used permissions, then applied and monitored for access-denied errors. 67% of permissions removed, zero functionality lost, attack surface cut by two-thirds.]

Add a small buffer: for example, 12 used permissions plus 2-3 related ones. Downgrade the master key to one that opens 14 doors instead of every door in the building. Scoping too aggressively scares teams away from right-sizing entirely. An engineer whose deployment breaks because IAM was trimmed too tight will never trust the process again, and will fight every future trim.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::order-processing-prod",
      "arn:aws:s3:::order-processing-prod/*"
    ],
    "Condition": {
      "StringEquals": {
        "aws:PrincipalTag/team": "commerce"
      }
    }
  }]
}

Condition keys are the under-used feature of IAM policies. aws:PrincipalTag/team ensures only the commerce team’s roles can touch commerce buckets, even if another role accidentally gets the same S3 permissions. Defense in depth at the policy layer. The key card that checks your department badge before opening the door.
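To make the effect of the condition concrete, here is a deliberately simplified sketch of how a StringEquals block gates a request. It is not the real IAM evaluation engine: it handles only StringEquals, ignores key-name case-insensitivity, and the role contexts shown are hypothetical.

```python
def string_equals_matches(condition: dict, request_context: dict) -> bool:
    """Check only the StringEquals operator of a Condition block.

    Every listed key must be present in the request context and equal
    one of the allowed values; a missing key fails the condition.
    """
    for key, allowed in condition.get("StringEquals", {}).items():
        allowed_values = [allowed] if isinstance(allowed, str) else list(allowed)
        if request_context.get(key) not in allowed_values:
            return False
    return True


# The Condition block from the policy above.
condition = {"StringEquals": {"aws:PrincipalTag/team": "commerce"}}

# Hypothetical request contexts for three roles holding the same S3 permissions.
commerce_role = {"aws:PrincipalTag/team": "commerce"}
analytics_role = {"aws:PrincipalTag/team": "analytics"}
untagged_role = {}

for name, context in [("commerce", commerce_role), ("analytics", analytics_role), ("untagged", untagged_role)]:
    print(name, "allowed" if string_equals_matches(condition, context) else "not allowed")
```

The untagged case is the one worth noticing: a role that was never tagged does not slip through, because an absent key cannot satisfy the comparison.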

Eliminating Long-Lived Keys

Service account keys are the credentials that refuse to die. They get created for a CI pipeline, pasted into environment variables, copied to a developer’s laptop for local testing, committed to a private repo that isn’t as private as everyone assumes, and then forgotten. A copy of the master key left in the lobby “for the delivery driver.” Still there two years later. The key outlives the project it was created for, the team that created it, and sometimes the engineer who generated it.

Workload identity federation eliminates keys entirely. EC2 instance metadata, Kubernetes projected service account tokens, and GitHub Actions OIDC all hand out credentials that are short-lived (typically an hour or less), auto-rotated, and scoped to the specific workload. Temporary day passes that expire at 5pm. Nothing to steal, nothing to rotate, nothing to commit.
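On the AWS side, federation for a CI pipeline comes down to a role trust policy that accepts the platform's OIDC token instead of a key. The following is a minimal sketch for GitHub Actions; the account ID, repository name, and the helper `github_oidc_trust_policy` are placeholders for illustration, and it assumes the GitHub OIDC identity provider has already been registered in the account.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str, branch: str) -> dict:
    """Build a trust policy that lets one branch of one repository assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Audience the token was issued for.
                    f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                    # Subject pins the token to a specific repo and branch.
                    f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                }
            },
        }],
    }


# Placeholder account and repository, for illustration only.
policy = github_oidc_trust_policy("123456789012", "example-org/order-service", "main")
print(json.dumps(policy, indent=2))
```

Pinning the `sub` claim matters as much as removing the key: without it, any workflow that can obtain a token from the same provider could assume the role.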

[Figure: Workload identity federation, zero static keys. A Kubernetes pod has a service account and no AWS credentials. Kubernetes issues an OIDC token proving the pod's identity. AWS STS exchanges the OIDC token for short-lived AWS credentials. The pod accesses S3 with 15-minute, auto-refreshed credentials. No access key, no secret key, no credential to leak.]

Where keys remain unavoidable: 90-day max lifetime, automated rotation via Vault. Most key-based authentication can be replaced with federation in a single quarter. Security engineering covers the migration patterns.
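For the keys that remain, the 90-day limit is easy to check once there is an inventory. Here is a minimal sketch, assuming a list of key records with a creation timestamp; the record shape and the key IDs are hypothetical, and in practice the inventory would come from the cloud provider's API or a secrets manager.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)


def overdue_keys(keys: list[dict], now: datetime) -> list[str]:
    """Return the IDs of keys older than the maximum allowed age."""
    return [key["id"] for key in keys if now - key["created"] > MAX_KEY_AGE]


# Hypothetical inventory, evaluated at a fixed point in time.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
inventory = [
    {"id": "ci-deploy-key", "created": datetime(2022, 6, 1, tzinfo=timezone.utc)},
    {"id": "batch-export-key", "created": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": "legacy-sync-key", "created": now - timedelta(days=90)},
]

for key_id in overdue_keys(inventory, now):
    print(f"rotate or remove: {key_id}")
```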

Anti-pattern

Don’t: Create a service account key, paste it into CI/CD environment variables, and consider the integration “done.” The key will outlive the project, the team, and possibly the person who created it. A master key copy left in the lobby for the courier. The courier left the company. The key is still there.

Do: Use OIDC federation. GitHub Actions, GitLab CI, and every major CI platform support native token exchange with cloud providers. No key to steal, rotate, or accidentally commit. Day passes. Not master keys.

Privilege Escalation Path Analysis

iam:PassRole + lambda:CreateFunction = deploy a Lambda executing with admin access, as soon as the function can be invoked or triggered. Neither permission looks dangerous alone. Together, they’re a privilege escalation path that most teams would never find by reviewing policies one at a time. Two harmless-looking key cards that, combined, open the vault.
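Because the danger lies in the combination, a check has to look at a role's full set of actions rather than one statement at a time. The following is a minimal sketch, assuming each role's effective actions have already been flattened into a set; the catalogue contains only the single combination named here, and the role names are hypothetical.

```python
# Combinations that are harmless alone but dangerous together.
# Only the pair discussed in this article is listed; a real catalogue
# would be longer. Running the function also requires some way to
# invoke or trigger it, which this sketch does not model.
RISKY_COMBINATIONS = {
    "lambda-passrole": {"iam:PassRole", "lambda:CreateFunction"},
}


def risky_combinations(role_actions: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each role to the names of the risky combinations it fully holds."""
    findings = {}
    for role, actions in role_actions.items():
        hits = [name for name, combo in RISKY_COMBINATIONS.items() if combo <= actions]
        if hits:
            findings[role] = hits
    return findings


# Hypothetical roles with their flattened action sets.
roles = {
    "dev-team-role": {"iam:PassRole", "lambda:CreateFunction", "s3:GetObject"},
    "reporting-role": {"iam:PassRole", "s3:GetObject"},
    "deploy-bot": {"lambda:CreateFunction", "lambda:UpdateFunctionCode"},
}

print(risky_combinations(roles))
```

Only the role holding both permissions is reported; the two roles that each hold one half pass, which is exactly why per-policy review misses the path.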

[Figure: Privilege escalation, how iam:PassRole becomes admin. A developer role with iam:PassRole passes an AdminAccess role to a Lambda function, which then executes with full account control. Developer with PassRole + Lambda create = effectively admin. Fix: restrict PassRole to specific role ARNs, never allow PassRole to admin-level roles, and add an SCP that denies iam:PassRole for roles outside an approved list. PassRole is the most commonly overlooked privilege escalation path in AWS.]

Policy graph tools list every path to privileged targets. Typical accounts reveal several non-obvious paths on first analysis. Build this analysis into your IaC pipeline: Terraform adding iam:PassRole triggers escalation path analysis before apply. The engineer learns right away which combinations are dangerous, not months later during an audit. The locksmith who checks whether your new key accidentally opens the vault before cutting it.
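The core of that analysis is a path search over a directed graph. Here is a minimal sketch using breadth-first search over a hand-written edge list; it is not how any particular tool is implemented, and building the edges from real policies (the hard part) is assumed to have happened already. The role names follow the three-hop example earlier in the article.

```python
from collections import deque


def escalation_paths(edges: dict, start: str, targets: set[str]) -> list[list[str]]:
    """Return every simple path from start to any privileged target.

    edges maps a node to a list of (permission, neighbor) pairs.
    """
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in targets and len(path) > 1:
            paths.append(path)
            continue  # no need to search beyond a privileged target
        for _permission, neighbor in edges.get(node, []):
            if neighbor not in path:  # avoid cycles
                queue.append(path + [neighbor])
    return paths


# Hand-written edges mirroring the developer-to-admin chain.
edges = {
    "dev-team-role": [("iam:PassRole", "lambda-exec-role")],
    "lambda-exec-role": [("sts:AssumeRole", "infra-service-role")],
    "infra-service-role": [("admin policy attached", "admin-role")],
    "reporting-role": [("s3:GetObject", "reports-bucket")],
}

for path in escalation_paths(edges, "dev-team-role", {"admin-role"}):
    print(f"{len(path) - 1} hops: " + " -> ".join(path))
```

In an IaC pipeline, a search like this would run against the graph as it would exist after the proposed change, so a new edge that completes a path is caught before apply.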

JIT Access for Production

Standing admin access to production is a liability that pays no dividend. Nobody needs admin at 3pm on a random afternoon. They need it during incidents, during migration windows, during specific debugging sessions. A permanent badge that opens every door when you only visit the secure wing twice a quarter. JIT access matches the access to the need.

Request elevated access for a specific task. 1-8 hour TTL. Auto-revoke when the window closes. A visitor badge. Works for 4 hours. Auto-deactivates. Read-only production access can be broadly available. Write access requires JIT with an approval workflow. Break-glass for Sev1 incidents bypasses approval but generates a full audit trail. The emergency key behind the glass panel. Use it if you must. It sets off the alarm.

Access Tier | Who Gets It | How | TTL | Audit
Read-only production | All engineers | Default role | Permanent | Standard CloudTrail
Write access | Requesting engineer | JIT approval | 1-8 hours | Enhanced logging + Slack alert
Admin / break-glass | On-call SRE | Self-service with audit | 1-4 hours | Full session recording
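The TTL rules in the tiers above can be expressed as a small data model. This is a minimal sketch of the expiry logic only, with hypothetical names throughout; approval workflows, audit logging, and the actual credential issuance are out of scope.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_TTL = timedelta(hours=1)
# Upper bounds taken from the tier table: write 1-8 hours, break-glass 1-4 hours.
MAX_TTL = {"write": timedelta(hours=8), "break-glass": timedelta(hours=4)}


@dataclass(frozen=True)
class Grant:
    engineer: str
    tier: str
    issued_at: datetime
    ttl: timedelta

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + self.ttl

    def is_active(self, now: datetime) -> bool:
        """A grant is valid from issue time up to, but not including, expiry."""
        return self.issued_at <= now < self.expires_at


def issue_grant(engineer: str, tier: str, ttl: timedelta, now: datetime) -> Grant:
    """Create a time-boxed grant, rejecting unknown tiers and out-of-range TTLs."""
    limit = MAX_TTL.get(tier)
    if limit is None:
        raise ValueError(f"unknown tier: {tier}")
    if not MIN_TTL <= ttl <= limit:
        raise ValueError(f"ttl for {tier} must be between {MIN_TTL} and {limit}")
    return Grant(engineer, tier, now, ttl)


now = datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc)
grant = issue_grant("alice", "write", timedelta(hours=4), now)
print(grant.is_active(now + timedelta(hours=3)))  # still inside the window
print(grant.is_active(now + timedelta(hours=4)))  # window has closed
```

Nothing in the model has to "revoke" anything: access is checked against the expiry on each use, so the grant lapses on its own when the window closes.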

Cloud-native observability reduces the need for direct production access in the first place. If your logs and traces are thorough, most debugging doesn’t require SSH. The best door is the one you don’t need to open.

Cross-Account Trust Architecture

Cross-account trust relationships are their own attack surface. Almost never governed as carefully as the roles within a single account. Giving another building’s tenants access to your building. Then forgetting they still have it three years later.

Specify exact role ARNs in trust policies, not account IDs. “Account A can assume” hands the decision to Account A: any identity there that is allowed to call sts:AssumeRole on your role can assume it. Every person in the other building can be given a key to yours. A specific ARN restricts to exactly one. Audit cross-account trust relationships quarterly. Trust relationships created for a specific project and left unrestricted after that project ends are persistent access paths nobody remembers authorizing. The spare key you gave the contractor. The project ended. They still have the key.
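A quarterly audit can start with a mechanical pass over every role's trust policy. Here is a minimal sketch that flags account-wide principals and statements with no Condition block; the function name, the finding wording, and the example ARNs are illustrative assumptions.

```python
import re

# An AWS principal that names a whole account: a bare 12-digit ID or its root ARN.
ACCOUNT_WIDE = re.compile(r"^(\d{12}|arn:aws:iam::\d{12}:root)$")


def _as_list(value):
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def audit_trust_policy(trust_policy: dict) -> list[str]:
    """Return findings for over-broad principals and unconditioned trust."""
    findings = []
    for index, stmt in enumerate(trust_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            findings.append(f"statement {index}: trusts any principal")
            continue
        aws_principals = _as_list(principal.get("AWS")) if isinstance(principal, dict) else []
        for entry in aws_principals:
            if entry == "*" or ACCOUNT_WIDE.match(entry):
                findings.append(f"statement {index}: trusts an entire account ({entry})")
        if aws_principals and not stmt.get("Condition"):
            findings.append(f"statement {index}: cross-account trust without conditions")
    return findings


# Hypothetical trust policy left over from a migration.
legacy_trust = {
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
        "Action": "sts:AssumeRole",
    }]
}

for finding in audit_trust_policy(legacy_trust):
    print(finding)
```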

The Wildcard Debt

IAM policies with Action: * or Resource: * that were created under deadline pressure and persist for years because no automated process flags them and no engineer remembers they exist. Master keys cut in a rush and never collected. In a 50-person engineering org over 3 years, wildcard debt piles up until the IAM posture is mostly historical accident. The attack surface grows with every hire, every project, every deadline-driven shortcut that never gets cleaned up.

What the Industry Gets Wrong About IAM Security

“Least privilege is a policy, not an engineering problem.” Writing a policy document that says “use least privilege” doesn’t change the developer who adds Action: * because the deadline is tomorrow. Least privilege is enforced by automated tooling (IAM Access Analyzer, policy-as-code gates in CI) that makes over-permissioning harder than right-sizing. A sign that says “don’t copy the master key” next to a key-copying machine. Nobody reads the sign.

“Service accounts are fine with long-lived keys.” Service account keys are credentials that never expire, get copied to CI/CD variables, committed to repositories, and pasted in Slack DMs. Master keys photocopied and left in drawers across the building. Prefer OIDC federation for CI/CD. Where keys are unavoidable, enforce 90-day maximum lifetime with automated rotation.

Our take

Run IAM Access Analyzer monthly and trim unused permissions. The analysis shows which API calls a role actually made in the past 90 days. Which doors each person actually opened. Scope to those calls plus a small buffer. The teams that adopt this discipline see their IAM attack surface shrink within two quarters without breaking a single workload. The key is the buffer. Trim too aggressively and you break deployments. Trim too conservatively and you’ve done nothing. Two to three extra permissions beyond observed usage is the sweet spot. Downgrade the master key. Don’t take away the key entirely.

That Action: * policy from the deadline scramble? The pipeline rejects it now. “Wildcard actions not permitted.” The developer scopes to exact permissions. Ten minutes of work instead of two years of silent overexposure. Same building. The master keys are collected. Every door has the right lock. Every person has the right card.

Find and Fix Your IAM Attack Surface

Over-privileged roles and forgotten service account keys are silent liabilities. Deep IAM assessment, privilege escalation path mapping, and least-privilege architectures that don’t slow teams down are how you close the gaps before an attacker maps them for you.

Frequently Asked Questions

What does least-privilege actually mean for cloud IAM in practice?

Least privilege means each identity can do exactly the operations it needs and nothing more. In AWS, that means scoping IAM policies to specific resource ARNs instead of wildcards and separating read/write into distinct roles. IAM Access Analyzer shows which permissions a role actually used in the past 90 days. Teams that review and trim unused permissions monthly consistently shrink their IAM permission surface within a few quarters.

What is workload identity federation and why replace service account keys?

Workload identity federation lets a workload exchange a short-lived runtime token from Kubernetes, GitHub Actions, or EC2 instance metadata for cloud credentials without any static key file. No key means nothing to steal, rotate, or accidentally commit to git. Every major cloud provider supports this natively. Migrating from key-based auth to federation is the single highest-impact IAM improvement most teams can make.

What is a privilege escalation path and how do you find them?

A privilege escalation path is a chain of IAM permissions that lets a lower-privileged identity reach admin access. For example, a role with iam:PassRole plus lambda:CreateFunction can deploy a Lambda that runs with an admin role. IAM policy graph analysis tools build a directed graph of all IAM entities and find every path to a privileged target. AWS accounts almost always reveal multiple non-obvious escalation paths when first analyzed.

What is just-in-time access and when should you use it?

Just-in-time access grants elevated privileges only for the duration of a specific approved task, then auto-revokes them. Tooling built around AWS IAM Identity Center, or products such as HashiCorp Boundary, can support JIT workflows with 1-8 hour time-to-live windows. JIT is most valuable for production database access, infrastructure change windows, and incident response. It shrinks standing privilege exposure from always-on to hours.

How should cross-account access be structured in AWS?

Use IAM roles with trust policies, never shared credentials. A role in Account B trusts specific role ARNs in Account A, not the entire account. Audit cross-account trust relationships quarterly because they get created for specific projects and left unrestricted long after the project ends, creating persistent access paths nobody remembers authorizing.