IAM: Least Privilege That Actually Holds
A developer hits a permission error deploying a new Lambda function. The deadline is tomorrow. She adds "Effect": "Allow", "Action": "*", "Resource": "*" to the IAM policy because that’s the fastest thing that makes the error disappear. The feature ships. The policy stays in place for the next two years because nobody remembers it exists and no automated process flags it.
A master key cut because the right key wasn’t available. Left on the hook. For two years.
Multiply that across 50 engineers over three years. In cloud environments, the IAM posture is mostly historical accident. Attackers who compromise a single generous role pivot using legitimate APIs. Every action looks normal. CloudTrail shows API calls indistinguishable from developer activity, because the permissions granted are identical. The burglar has a master key. The cameras show someone walking calmly through the front door.
- Action: * policies stay in place for years because nobody remembers they exist and no automated process flags them. The fastest fix becomes the permanent architecture. The master key that was supposed to be temporary.
- Least privilege is enforced by tooling, not policy. IAM Access Analyzer identifies unused permissions. Automated trimming reduces attack surface without developer friction.
- Service account keys should have maximum 90-day lifetimes. Keys older than that are almost certainly forgotten. Prefer OIDC federation over long-lived keys entirely.
- Cross-account trust relationships from old migrations are privilege escalation paths. Audit quarterly. Remove what’s no longer needed. The spare key you gave the contractor last year. They still have it.
- Break-glass access needs an automated audit trail, not just a wiki page with instructions. Emergency admin access without logging is indistinguishable from compromise. The emergency key that doesn’t set off the alarm.
The Wildcard Problem
Every IAM audit tells the same story. Dozens of policies with wildcard actions, wildcard resources, or both. Each one created under deadline pressure. Each one forgotten right after the deployment succeeded. Master keys everywhere. Nobody remembers who cut them or why. The total result is an attack surface that grows without anyone intending it to.
| IAM Anti-Pattern | Risk | How Common | Fix |
|---|---|---|---|
| "Resource": "*" wildcard | Blast radius = entire account | Pervasive in audits | Scope to specific ARNs |
| "Action": "*" admin access | Full admin to any service | Common in service roles | Grant minimum required actions |
| Long-lived access keys | Never rotated, leaked in repos | Keys routinely outlive their creators | Dynamic credentials via IAM roles |
| Cross-account trust without conditions | Any role in trusted account can assume | Widespread in multi-account setups | Add sts:ExternalId condition |
| Unused permissions retained | Attack surface grows without limit | Most permissions go unused | Monthly access analyzer review |
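The first two rows of the table are the easiest to detect mechanically. The sketch below is one possible check, not a full policy linter: `find_wildcards` is a hypothetical helper name, and treating service-wide actions such as `s3:*` as wildcards is an assumption of this sketch that goes beyond the bare `*` case described above.

```python
def _as_list(value):
    """IAM accepts either a single string or a list; normalise to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def find_wildcards(policy: dict) -> list[str]:
    """Return one finding per wildcard Action or Resource in Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be a single object
        statements = [statements]

    findings = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action")):
            # Assumption: service-wide wildcards ("s3:*") are flagged as well.
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: wildcard action {action!r}")
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource {resource!r}")
    return findings
```

A pipeline step could fail the build whenever the returned list is non-empty. Some actions do not support resource-level permissions, so a real gate would probably need a reviewed allowlist for those cases.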
IAM Access Analyzer surfaces actual usage: which API calls a role made over 90 days. The security audit that shows which doors each person actually opened. Remove what was never used. Teams trimming monthly see sharp drops in their permission surface within a few quarters.
Add a small buffer: if a role actually used 12 actions, keep those 12 plus 2-3 closely related ones. Downgrade the master key to one that opens 14 doors instead of every door in the building. Scoping too aggressively scares teams away from right-sizing entirely. An engineer whose deployment breaks because IAM was trimmed too tight will never trust the process again. (Break their deploy once and they’ll fight every future trim.)
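The trimming rule reduces to set arithmetic. This is a minimal sketch, assuming the granted actions have already been expanded into explicit action names (no wildcards) and that the usage data comes from something like an Access Analyzer report. `right_size` and its parameter names are hypothetical.

```python
def right_size(granted: set[str], used: set[str], buffer: set[str]) -> dict:
    """Split a role's granted actions into those to keep and those to remove.

    keep   = actions actually used, plus a small buffer of related actions
    remove = everything else the role currently holds
    """
    # Intersect with `granted` so trimming can never widen the role.
    keep = (used | buffer) & granted
    remove = granted - keep
    return {"keep": sorted(keep), "remove": sorted(remove)}


if __name__ == "__main__":
    plan = right_size(
        granted={"s3:GetObject", "s3:PutObject", "s3:ListBucket", "s3:DeleteObject"},
        used={"s3:GetObject", "s3:PutObject"},
        buffer={"s3:ListBucket"},
    )
    print(plan)
```

Intersecting with the granted set is a deliberate choice in this sketch: the buffer keeps related permissions the role already has, and never introduces new ones.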
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:ListBucket"
    ],
    "Resource": [
      "arn:aws:s3:::order-processing-prod",
      "arn:aws:s3:::order-processing-prod/*"
    ],
    "Condition": {
      "StringEquals": {
        "aws:PrincipalTag/team": "commerce"
      }
    }
  }]
}
Condition keys are the under-used feature of IAM policies. aws:PrincipalTag/team ensures only the commerce team’s roles can touch commerce buckets, even if another role accidentally gets the same S3 permissions. Defense in depth at the policy layer. The key card that checks your department badge before opening the door.
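To make the effect of the tag condition concrete, here is a deliberately simplified model of how a StringEquals check on aws:PrincipalTag behaves. It is not AWS's evaluation engine: it ignores Action and Resource matching, explicit denies, and every other condition operator. `statement_allows` is a hypothetical name.

```python
PRINCIPAL_TAG_PREFIX = "aws:PrincipalTag/"


def statement_allows(statement: dict, principal_tags: dict[str, str]) -> bool:
    """Simplified check: does the caller satisfy the statement's tag conditions?

    Only StringEquals on aws:PrincipalTag/<key> is modelled. Any other
    condition key makes this sketch fail closed.
    """
    if statement.get("Effect") != "Allow":
        return False
    conditions = statement.get("Condition", {}).get("StringEquals", {})
    for key, expected in conditions.items():
        if not key.startswith(PRINCIPAL_TAG_PREFIX):
            return False  # unsupported key in this sketch
        tag_name = key[len(PRINCIPAL_TAG_PREFIX):]
        expected_values = [expected] if isinstance(expected, str) else list(expected)
        # A missing tag yields None, which never matches an expected value.
        if principal_tags.get(tag_name) not in expected_values:
            return False
    return True


if __name__ == "__main__":
    stmt = {
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::order-processing-prod/*"],
        "Condition": {"StringEquals": {"aws:PrincipalTag/team": "commerce"}},
    }
    print(statement_allows(stmt, {"team": "commerce"}))
    print(statement_allows(stmt, {"team": "payments"}))
```

The point of the model is the last case: a role with the same S3 actions but a different or missing team tag does not pass the condition.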
Eliminating Long-Lived Keys
Service account keys are the credentials that refuse to die. They get created for a CI pipeline, pasted into environment variables, copied to a developer’s laptop for local testing, committed to a private repo that isn’t as private as everyone assumes, and then forgotten. A copy of the master key left in the lobby “for the delivery driver.” Still there two years later. The key outlives the project it was created for, the team that created it, and sometimes the engineer who generated it.
Workload identity federation eliminates keys entirely. EC2 instance metadata, Kubernetes projected service account tokens, GitHub Actions OIDC. Short-lived (1 hour), auto-rotated, scoped to the specific workload. Temporary day passes that expire at 5pm. Nothing to steal, nothing to rotate, nothing to commit.
Where keys remain unavoidable: 90-day max lifetime, automated rotation via Vault. Most key-based authentication can be replaced with federation in a single quarter. Security engineering covers the migration patterns.
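For the keys that remain, the 90-day rule is simple to check in code. The sketch below assumes key metadata is already available as dictionaries. The `AccessKeyId` and `CreateDate` field names are borrowed from the shape of AWS access key metadata as an illustration, and `stale_keys` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_KEY_AGE = timedelta(days=90)


def stale_keys(keys: list[dict], now: Optional[datetime] = None) -> list[str]:
    """Return the IDs of keys older than the 90-day maximum lifetime.

    Each key is a dict with an "AccessKeyId" string and a timezone-aware
    "CreateDate" datetime (assumed field names, see the note above).
    """
    now = now or datetime.now(timezone.utc)
    return [k["AccessKeyId"] for k in keys if now - k["CreateDate"] > MAX_KEY_AGE]


if __name__ == "__main__":
    inventory = [
        {"AccessKeyId": "AKIAOLDEXAMPLE", "CreateDate": datetime(2024, 1, 1, tzinfo=timezone.utc)},
        {"AccessKeyId": "AKIANEWEXAMPLE", "CreateDate": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    ]
    print(stale_keys(inventory, now=datetime(2024, 6, 1, tzinfo=timezone.utc)))
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to run as a scheduled job with reproducible results.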
Don’t: Create a service account key, paste it into CI/CD environment variables, and consider the integration “done.” The key will outlive the project, the team, and possibly the person who created it. A master key copy left in the lobby for the courier. The courier left the company. The key is still there.
Do: Use OIDC federation. GitHub Actions, GitLab CI, and every major CI platform support native token exchange with cloud providers. No key to steal, rotate, or accidentally commit. Day passes. Not master keys.
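As an illustration of the federation side, the following sketch builds an AWS role trust policy for GitHub Actions OIDC. The provider host, the `aud` value, and the `sub` claim format follow GitHub's and AWS's documented conventions. The function name, the account ID, and the repository are placeholders.

```python
def github_oidc_trust_policy(account_id: str, repo: str, ref: str = "refs/heads/main") -> dict:
    """Build a trust policy that lets one repo/branch assume a role via OIDC.

    No long-lived key is involved: the workflow exchanges its short-lived
    GitHub token for temporary credentials via sts:AssumeRoleWithWebIdentity.
    """
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{provider}:aud": "sts.amazonaws.com",
                    # Pin to one repository and one ref, not the whole org.
                    f"{provider}:sub": f"repo:{repo}:ref:{ref}",
                }
            },
        }],
    }


if __name__ == "__main__":
    import json
    print(json.dumps(github_oidc_trust_policy("111122223333", "example-org/orders"), indent=2))
```

Pinning the `sub` claim to a specific repository and ref is the part that keeps the day pass scoped to one workload. A looser pattern would let any workflow in the organisation assume the role.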
Privilege Escalation Path Analysis
iam:PassRole + lambda:CreateFunction = deploy a Lambda executing with admin access. Neither permission looks dangerous alone. Together, they’re a privilege escalation path that most teams would never find by reviewing policies one at a time. Two harmless-looking key cards that, combined, open the vault.
Policy graph tools list every path to privileged targets. Typical accounts reveal several non-obvious paths on first analysis. Build this analysis into your IaC pipeline: Terraform adding iam:PassRole triggers escalation path analysis before apply. The engineer learns right away which combinations are dangerous, not months later during an audit. The locksmith who checks whether your new key accidentally opens the vault before cutting it.
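A pre-apply check along these lines can be sketched as a lookup of known dangerous combinations. This is far simpler than a real policy graph tool, which follows paths across roles and resources. The combination table, its names, and both functions are hypothetical, and the second entry is a well-known pairing added for illustration rather than taken from the text above.

```python
# Known-dangerous action combinations (illustrative, not exhaustive).
ESCALATION_COMBOS = {
    "passrole-lambda": frozenset({"iam:PassRole", "lambda:CreateFunction"}),
    # Not from the article: passing a role to a new EC2 instance.
    "passrole-ec2": frozenset({"iam:PassRole", "ec2:RunInstances"}),
}


def escalation_paths(actions: set[str]) -> list[str]:
    """Names of every combination fully contained in the given action set."""
    return sorted(name for name, combo in ESCALATION_COMBOS.items() if combo <= actions)


def introduced_paths(before: set[str], after: set[str]) -> list[str]:
    """Combinations that a proposed change creates and that did not exist before."""
    return sorted(set(escalation_paths(after)) - set(escalation_paths(before)))


if __name__ == "__main__":
    current = {"lambda:CreateFunction", "s3:GetObject"}
    proposed = current | {"iam:PassRole"}
    print(introduced_paths(current, proposed))
```

Comparing before and after means the gate only complains about paths the current change introduces, so the engineer sees a finding tied to their own diff.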
JIT Access for Production
Standing admin access to production is a liability that pays no dividend. Nobody needs admin at 3pm on a random afternoon. They need it during incidents, during migration windows, during specific debugging sessions. A permanent badge that opens every door when you only visit the secure wing twice a quarter. JIT access matches the access to the need.
Request elevated access for a specific task. 1-8 hour TTL. Auto-revoke when the window closes. A visitor badge. Works for 4 hours. Auto-deactivates. Read-only production access can be broadly available. Write access requires JIT with an approval workflow. Break-glass for Sev1 incidents bypasses approval but generates a full audit trail. The emergency key behind the glass panel. Use it if you must. It sets off the alarm.
| Access Tier | Who Gets It | How | TTL | Audit |
|---|---|---|---|---|
| Read-only production | All engineers | Default role | Permanent | Standard CloudTrail |
| Write access | Requesting engineer | JIT approval | 1-8 hours | Enhanced logging + Slack alert |
| Admin / break-glass | On-call SRE | Self-service with audit | 1-4 hours | Full session recording |
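The write-access tier can be modelled as a grant with a bounded lifetime. The sketch below is one way to express the 1-8 hour TTL and the auto-expiry. The `Grant` type and `issue_grant` function are hypothetical, and a real system would also revoke the underlying cloud credentials and emit audit events.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_TTL = timedelta(hours=1)
MAX_TTL = timedelta(hours=8)


@dataclass(frozen=True)
class Grant:
    engineer: str
    role: str
    reason: str
    issued_at: datetime
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        """A grant is valid from issue time up to, but not including, expiry."""
        return self.issued_at <= now < self.expires_at


def issue_grant(engineer: str, role: str, reason: str, ttl: timedelta, now: datetime) -> Grant:
    """Create a time-boxed grant, rejecting TTLs outside the 1-8 hour window."""
    if not MIN_TTL <= ttl <= MAX_TTL:
        raise ValueError("TTL must be between 1 and 8 hours")
    if not reason.strip():
        raise ValueError("a reason is required for the audit trail")
    return Grant(engineer, role, reason, issued_at=now, expires_at=now + ttl)


if __name__ == "__main__":
    start = datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc)
    grant = issue_grant("alice", "prod-write", "migrate orders table", timedelta(hours=4), start)
    print(grant.is_active(start + timedelta(hours=3)))
    print(grant.is_active(start + timedelta(hours=4)))
```

Because expiry is part of the grant itself, nothing has to remember to revoke it: any check after `expires_at` simply fails.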
Cloud-native observability reduces the need for direct production access in the first place. If your logs and traces are thorough, most debugging doesn’t require SSH. The best door is the one you don’t need to open.
Cross-Account Trust Architecture
Cross-account trust relationships are their own attack surface. Almost never governed as carefully as the roles within a single account. Giving another building’s tenants access to your building. Then forgetting they still have it three years later.
Specify exact role ARNs in trust policies, not account IDs. “Account A can assume” means every identity in Account A can assume. Every person in the other building gets a key to yours. A specific ARN restricts to exactly one. Audit cross-account trust relationships quarterly. Trust relationships created for a specific project and left unrestricted after that project ends are persistent access paths nobody remembers authorizing. The spare key you gave the contractor. The project ended. They still have the key.
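A quarterly audit can start by flagging trust policies whose principal is a whole account. This is a minimal sketch, assuming trust policies are available as parsed JSON and use the standard `aws` partition. `account_wide_principals` is a hypothetical helper, and the account IDs are placeholders.

```python
import re

# Matches a bare 12-digit account ID, an account root ARN, or "*".
ACCOUNT_WIDE = re.compile(r"^(\d{12}|arn:aws:iam::\d{12}:root|\*)$")


def account_wide_principals(trust_policy: dict) -> list[str]:
    """Return principals that trust an entire account instead of one role."""
    flagged = []
    for stmt in trust_policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":  # anonymous/any principal
            flagged.append("*")
            continue
        aws = principal.get("AWS", [])
        for entry in ([aws] if isinstance(aws, str) else aws):
            if ACCOUNT_WIDE.match(entry):
                flagged.append(entry)
    return flagged


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": [
                "arn:aws:iam::111122223333:root",
                "arn:aws:iam::111122223333:role/deployer",
            ]},
            "Action": "sts:AssumeRole",
        }],
    }
    print(account_wide_principals(policy))
```

Each flagged entry is a candidate for replacement with the exact role ARN that still needs the access, or for removal if the project behind it has ended.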
Wildcard debt: policies with Action: * or Resource: * that were created under deadline pressure and persist for years because no automated process flags them and no engineer remembers they exist. Master keys cut in a rush and never collected. In a 50-person engineering org over 3 years, wildcard debt piles up until the IAM posture is mostly historical accident. The attack surface grows with every hire, every project, every deadline-driven shortcut that never gets cleaned up.
What the Industry Gets Wrong About IAM Security
“Least privilege is a policy, not an engineering problem.” Writing a policy document that says “use least privilege” doesn’t change the developer who adds Action: * because the deadline is tomorrow. Least privilege is enforced by automated tooling (IAM Access Analyzer, policy-as-code gates in CI) that makes over-permissioning harder than right-sizing. A sign that says “don’t copy the master key” next to a key-copying machine. Nobody reads the sign.
“Service accounts are fine with long-lived keys.” Service account keys are credentials that never expire, get copied to CI/CD variables, committed to repositories, and pasted in Slack DMs. Master keys photocopied and left in drawers across the building. Prefer OIDC federation for CI/CD. Where keys are unavoidable, enforce 90-day maximum lifetime with automated rotation.
That Action: * policy from the deadline scramble? The pipeline rejects it now. “Wildcard actions not permitted.” The developer scopes to exact permissions. Ten minutes of work instead of two years of silent overexposure. Same building. The master keys are collected. Every door has the right lock. Every person has the right card.