Enterprise IAM: Least Privilege and Workload Identity
Mid-afternoon. A developer hits a permission error deploying a new Lambda function. The deadline is tomorrow. She adds "Effect": "Allow", "Action": "*", "Resource": "*" to the IAM policy because that is the fastest thing that makes the error disappear. The feature ships. The policy stays in place for the next two years because nobody remembers it exists and no automated process flags it. This pattern shows up at nearly every organization.
Now multiply that across a 50-person engineering organization over three years. You end up with an IAM posture that is mostly historical accident. Roles with permissions nobody uses. Service account keys that haven’t been rotated since 2022. Cross-account trust relationships created for a migration project that ended 18 months ago. And somewhere in that tangle, a privilege escalation path that lets an attacker who compromises a single developer workstation reach full admin access in production. It’s always there. The only question is whether you find it first.
The Verizon Data Breach Investigations Report consistently identifies credential misuse as a leading attack vector. In cloud environments, that almost always means IAM. Attackers who compromise a workload with a generous IAM role pivot across services, exfiltrate data from S3 buckets, and establish persistence through new IAM users. All of it uses legitimate cloud APIs, which makes it far harder to detect than a traditional network intrusion.
The Wildcard Problem
IAM policies with "Resource": "*" are the most common misconfiguration in cloud environments. They show up because they’re convenient: a wildcard means you never update the policy when you add new resources. They persist because right-sizing feels lower priority than building features. That’s the wrong tradeoff, but it’s a trap that catches every team eventually.
The risk is concrete. A service with s3:GetObject on "Resource": "*" can read every bucket in the account, including ones it has no business accessing. A role with iam:PassRole on "Resource": "*" can assign any role in the account to any service it creates. When that service gets compromised (not if, when), the blast radius is the entire account.
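A first pass at finding these policies can be automated. The sketch below assumes the policy document has already been loaded as a Python dictionary; the function name find_wildcard_statements and the sample policy are illustrative, not part of any AWS SDK.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM accepts a single string or a list for Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return Allow statements whose Resource or Action is the bare wildcard "*"."""
    flagged = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        if "*" in _as_list(stmt.get("Resource")) or "*" in _as_list(stmt.get("Action")):
            flagged.append(stmt)
    return flagged


# Hypothetical policy: one scoped statement, one wildcard statement.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-reports/*",
        },
        {"Effect": "Allow", "Action": "iam:PassRole", "Resource": "*"},
    ],
}

if __name__ == "__main__":
    for stmt in find_wildcard_statements(EXAMPLE_POLICY):
        print("wildcard statement:", stmt["Action"])
```

Treat the output as a review queue rather than a verdict: some actions do not support resource-level permissions and legitimately require "Resource": "*".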
AWS IAM Access Analyzer surfaces actual permission usage: it can generate a policy from the API calls a role made in CloudTrail over a window of up to 90 days. That data is the evidence base for right-sizing. You’re not guessing what permissions to remove. You’re removing what was never used. GCP’s IAM recommender and Azure Advisor provide comparable capabilities for their respective clouds. Teams that run this analysis monthly and trim unused permissions typically reduce their IAM surface by 40-60% within two quarters without breaking a single workload.
One critical detail most guides skip: add a small buffer when right-sizing. If a role used 12 permissions in the past 90 days, scope to those 12 plus 2-3 closely related permissions needed for common operational scenarios (like read access alongside the write access already exercised). Scoping too aggressively causes operational failures that scare teams away from right-sizing entirely. That’s worse than the original problem.
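The buffer rule can be written down as a small set calculation. This is a sketch under stated assumptions: the RELATED_ACTIONS map is hypothetical and would need to be curated by your own team, and the function only ever keeps permissions the role already has, so it can never widen a policy.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical map: an exercised action -> closely related actions worth keeping.
RELATED_ACTIONS: Dict[str, Set[str]] = {
    "s3:PutObject": {"s3:GetObject"},
    "sqs:SendMessage": {"sqs:GetQueueAttributes"},
}


def right_size(
    granted: Iterable[str],
    used: Iterable[str],
    related: Optional[Dict[str, Set[str]]] = None,
) -> Tuple[Set[str], Set[str]]:
    """Return (keep, remove): used actions plus a related-action buffer, and the rest."""
    related = RELATED_ACTIONS if related is None else related
    granted_set, used_set = set(granted), set(used)
    buffer: Set[str] = set()
    for action in used_set:
        buffer |= related.get(action, set())
    # Intersect with what is already granted so the result never adds permissions.
    keep = (used_set | buffer) & granted_set
    remove = granted_set - keep
    return keep, remove


if __name__ == "__main__":
    keep, remove = right_size(
        granted=["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "iam:PassRole"],
        used=["s3:PutObject"],
    )
    print("keep:", sorted(keep))
    print("remove:", sorted(remove))
```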
Eliminating Long-Lived Keys
Service account keys are a liability with no natural expiration. They sit in environment variables, get committed to code repositories, get copied between environments, and get shared in Slack DMs. Every copy is a breach vector that persists until someone explicitly revokes it. And in most organizations, nobody tracks where the copies went. Active keys turn up in repos that were “archived” two years ago.
Workload identity federation eliminates the key entirely. This is the single highest-impact IAM improvement most teams can make. An EC2 instance assumes an IAM role through its instance profile, with temporary credentials delivered by the instance metadata service. No key file. A Kubernetes pod uses a projected service account token to authenticate to cloud APIs via OIDC. No key file. A GitHub Actions workflow assumes an AWS IAM role using GitHub’s OIDC provider. No key file. In each case, the credential is short-lived (typically 1 hour), automatically rotated, and scoped to the specific workload. Nothing to steal, nothing to rotate, nothing to accidentally commit.
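For the GitHub Actions case, the key is replaced by a role trust policy that accepts GitHub’s OIDC tokens. The sketch below builds one such policy as a Python dictionary. The account ID, repository, and branch are placeholders, and it assumes an OIDC identity provider for token.actions.githubusercontent.com has already been registered in the account.

```python
import json
from typing import Any, Dict

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str, branch: str) -> Dict[str, Any]:
    """Build a trust policy that lets one repo/branch assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience expected for tokens exchanged with AWS STS.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Restrict to a single repository and branch.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = github_oidc_trust_policy("123456789012", "example-org/example-repo", "main")
    print(json.dumps(policy, indent=2))
```

The sub condition matters most here: without it, any repository on GitHub that can obtain a token with the right audience could attempt to assume the role.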
For the cases where static keys remain necessary (third-party integrations that only support key-based auth, certain legacy systems), enforce a 90-day maximum lifetime with automated rotation. HashiCorp Vault’s dynamic secrets engine issues short-lived credentials on demand and revokes them automatically. The point is removing human discipline from the rotation process. Any control that depends on someone remembering to do something will eventually fail. That’s not cynicism. That’s operational reality.
The migration path is straightforward: inventory all access keys by running aws iam list-access-keys for every IAM user in every account, identify which workloads each key authenticates, and replace them one by one with identity federation. Most teams find that 70-80% of their keys can be replaced with federation in a single quarter. The remaining 20-30% require vendor coordination or legacy system updates. The security engineering practice covers the migration patterns for each major integration type.
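Once the inventory exists, flagging keys that exceed a 90-day lifetime is a simple filter. This sketch assumes each key record is a dictionary shaped like the entries the AWS API returns when listing access keys (UserName, AccessKeyId, Status, CreateDate); collecting those records across users and accounts is left out, and the sample data is invented.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List

MAX_KEY_AGE = timedelta(days=90)


def stale_keys(
    keys: Iterable[Dict[str, Any]],
    now: datetime,
    max_age: timedelta = MAX_KEY_AGE,
) -> List[Dict[str, Any]]:
    """Return active keys whose age exceeds max_age."""
    return [
        key
        for key in keys
        if key["Status"] == "Active" and now - key["CreateDate"] > max_age
    ]


if __name__ == "__main__":
    # Invented inventory for illustration.
    inventory = [
        {
            "UserName": "svc-reporting",
            "AccessKeyId": "AKIAEXAMPLEOLD",
            "Status": "Active",
            "CreateDate": datetime(2024, 1, 1, tzinfo=timezone.utc),
        },
        {
            "UserName": "svc-billing",
            "AccessKeyId": "AKIAEXAMPLENEW",
            "Status": "Active",
            "CreateDate": datetime(2024, 5, 1, tzinfo=timezone.utc),
        },
    ]
    for key in stale_keys(inventory, now=datetime(2024, 6, 1, tzinfo=timezone.utc)):
        print(key["UserName"], key["AccessKeyId"])
```

The stale list doubles as a migration backlog: each entry is either a candidate for federation or a key that needs automated rotation.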
Privilege Escalation Path Analysis
This is where IAM gets genuinely dangerous. Individual policies often look reasonable in isolation. The danger is in combinations. A role with iam:PassRole can assign any eligible role to an AWS service, creating an indirect path to higher privilege. A role with lambda:CreateFunction plus iam:PassRole pointing at an admin role can deploy a Lambda that executes with full admin access, even though neither permission alone looks dangerous. This is the attack chain that catches security teams who review policies one at a time.
Manual review cannot find these paths reliably at any scale. Don’t try. PMapper constructs a directed graph of all IAM entities and permissions in an account, then enumerates every path from a starting identity to a privileged target. AWS accounts typically have 3-8 non-obvious escalation paths when first analyzed. Some accounts have over 20. Every one of those paths is a breach chain that a determined attacker will find. The question is whether you find them first.
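The graph idea is easier to see in miniature. The sketch below is not PMapper and does not use its API; it is a toy depth-first search over a hand-written graph in which an edge from A to B means "A can obtain B’s privileges". All identity names are hypothetical.

```python
from typing import Dict, List, Set


def escalation_paths(
    graph: Dict[str, List[str]], start: str, targets: Set[str]
) -> List[List[str]]:
    """Enumerate every simple path from start to any privileged target."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node in targets and len(path) > 1:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip cycles
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


# Hypothetical account: each edge is individually "reasonable".
EXAMPLE_GRAPH = {
    "dev-user": ["ci-role", "lambda-deployer"],
    "ci-role": ["lambda-deployer"],
    "lambda-deployer": ["admin-role"],  # e.g. lambda:CreateFunction + iam:PassRole
    "readonly-user": [],
}

if __name__ == "__main__":
    for path in escalation_paths(EXAMPLE_GRAPH, "dev-user", {"admin-role"}):
        print(" -> ".join(path))
```

No single edge in the example grants admin access, yet two complete paths exist from dev-user. That is the property a one-policy-at-a-time review misses.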
Run PMapper on every significant IAM change, not just quarterly. Build it into your infrastructure-as-code pipeline so that a Terraform plan adding iam:PassRole to a role triggers an escalation path analysis before the change is applied. If a new path is created, the PR gets flagged for security review. This is far cheaper than discovering the path during an incident.
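A pipeline gate for this can start small. The sketch below inspects the JSON form of a Terraform plan (as produced by terraform show -json) and lists IAM policy resources whose planned policy allows iam:PassRole. It is a coarse trigger under stated assumptions: it only looks at the four resource types named in the code, it flags any planned policy containing the permission rather than diffing against the previous version, and the escalation analysis itself would be a separate step.

```python
import json
from typing import Any, Dict, List

IAM_POLICY_TYPES = {
    "aws_iam_policy",
    "aws_iam_role_policy",
    "aws_iam_user_policy",
    "aws_iam_group_policy",
}
PASSROLE_MATCHES = {"iam:passrole", "iam:*", "*"}


def _as_list(value: Any) -> List[Any]:
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def passrole_changes(plan: Dict[str, Any]) -> List[str]:
    """Return addresses of planned IAM policies that allow iam:PassRole."""
    flagged = []
    for change in plan.get("resource_changes", []):
        if change.get("type") not in IAM_POLICY_TYPES:
            continue
        after = (change.get("change") or {}).get("after") or {}
        raw_policy = after.get("policy")
        if not isinstance(raw_policy, str):
            continue  # value not known until apply, or resource is being deleted
        try:
            document = json.loads(raw_policy)
        except json.JSONDecodeError:
            continue
        for stmt in _as_list(document.get("Statement")):
            actions = {str(a).lower() for a in _as_list(stmt.get("Action"))}
            if stmt.get("Effect") == "Allow" and actions & PASSROLE_MATCHES:
                flagged.append(change["address"])
                break
    return flagged
```

In CI, a non-empty result would label the pull request for security review and kick off the full graph analysis.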
Just-in-Time Access for Production
Standing privileged access to production is a risk you accept every second it exists. Every hour a standing credential stays active is another hour an attacker could use it.
Just-in-time (JIT) access flips the model: engineers request elevated access for a specific task with a defined time window. A manager or automated policy approves. The access is granted for 1-8 hours (depending on the task), fully logged, and automatically revoked when the window expires. Tools like AWS IAM Identity Center (formerly SSO), HashiCorp Boundary, and Teleport support this workflow natively.
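The core of the model is a grant that carries its own expiry. The following sketch is a generic illustration, not the data model of any of the tools named above; the Grant class, the issue_grant function, and the 1-8 hour bounds are assumptions taken from the description in this section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_WINDOW = timedelta(hours=1)
MAX_WINDOW = timedelta(hours=8)


@dataclass(frozen=True)
class Grant:
    engineer: str
    role: str
    reason: str
    granted_at: datetime
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        """A grant is valid from granted_at up to, but not including, expires_at."""
        return self.granted_at <= now < self.expires_at


def issue_grant(engineer: str, role: str, reason: str, hours: float, now: datetime) -> Grant:
    """Create a time-boxed grant; reject windows outside the allowed range."""
    window = timedelta(hours=hours)
    if not MIN_WINDOW <= window <= MAX_WINDOW:
        raise ValueError(f"window must be between 1 and 8 hours, got {hours}")
    if not reason.strip():
        raise ValueError("a reason is required so the grant can be audited")
    return Grant(engineer, role, reason, granted_at=now, expires_at=now + window)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    grant = issue_grant("alice", "prod-db-writer", "roll back failed migration", 2, now)
    print(grant.is_active(now + timedelta(hours=1)))
    print(grant.is_active(now + timedelta(hours=3)))
```

Because expiry is checked on every use rather than enforced by a cleanup job, a missed revocation cannot leave access standing.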
You will hear resistance from engineers: “I need production access for debugging, and I need it now, not after an approval workflow.” Fair point. Handle it with tiered access. Read-only production access (logs, metrics, traces) should be broadly available without JIT. Write access to production data and infrastructure changes requires JIT. Emergency break-glass access exists as a documented, audited bypass for genuine Severity 1 incidents. This pairs naturally with cloud-native architecture patterns where observability reduces the need for direct production access.
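The three tiers reduce to a short decision function. This is a sketch of the policy logic only; the Tier and Decision names are invented for illustration, and a real system would also log every decision and alert on break-glass use.

```python
from enum import Enum


class Tier(Enum):
    READ_ONLY = "read_only"  # logs, metrics, traces
    WRITE = "write"          # production data and infrastructure changes


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_JIT = "require_jit"
    ALLOW_BREAK_GLASS = "allow_break_glass"


def decide(tier: Tier, has_active_grant: bool, sev1_declared: bool) -> Decision:
    """Map a production access request onto the tiered model."""
    if tier is Tier.READ_ONLY:
        return Decision.ALLOW  # broadly available, no approval needed
    if has_active_grant:
        return Decision.ALLOW  # approved, time-boxed JIT grant exists
    if sev1_declared:
        return Decision.ALLOW_BREAK_GLASS  # audited bypass for Severity 1 incidents
    return Decision.REQUIRE_JIT


if __name__ == "__main__":
    print(decide(Tier.READ_ONLY, has_active_grant=False, sev1_declared=False))
    print(decide(Tier.WRITE, has_active_grant=False, sev1_declared=False))
    print(decide(Tier.WRITE, has_active_grant=False, sev1_declared=True))
```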
Cross-Account Trust Architecture
Multi-account AWS architecture is a security best practice: production isolated, development separated, shared services in dedicated accounts. But the cross-account trust relationships that make this work are their own attack surface if not governed carefully. And they’re almost never governed carefully.
Every cross-account role trust policy should name exact role ARNs in its Principal element, not entire accounts. A trust policy that says “Account A can assume this role” means every identity in Account A with permission to call sts:AssumeRole can assume it. A trust policy that says “arn:aws:iam::ACCOUNT_A:role/cicd-deployer can assume this role” restricts it to a single, specific identity. That specificity is the difference between controlled access and an open door.
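Trust policies that name a whole account are easy to detect mechanically. The sketch below scans a trust policy, loaded as a dictionary, for principals that are a bare account ID, an account root ARN, or a wildcard. The function name and the sample account IDs are illustrative.

```python
import re
from typing import Any, Dict, List

# Matches a bare 12-digit account ID or an account root ARN in any AWS partition.
ACCOUNT_WIDE = re.compile(r"^(\d{12}|arn:aws[a-z-]*:iam::\d{12}:root)$")


def _as_list(value: Any) -> List[Any]:
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def account_wide_principals(trust_policy: Dict[str, Any]) -> List[str]:
    """Return principals in Allow statements that trust an entire account (or everyone)."""
    found = []
    for stmt in _as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            found.append("*")
            continue
        if not isinstance(principal, dict):
            continue
        for entry in _as_list(principal.get("AWS")):
            if entry == "*" or ACCOUNT_WIDE.match(str(entry)):
                found.append(entry)
    return found


# Hypothetical trust policy mixing a broad and a specific principal.
EXAMPLE_TRUST = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::111122223333:root",
                    "arn:aws:iam::111122223333:role/cicd-deployer",
                ]
            },
            "Action": "sts:AssumeRole",
        }
    ],
}

if __name__ == "__main__":
    print(account_wide_principals(EXAMPLE_TRUST))
```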
Audit cross-account trust relationships quarterly. They are the IAM equivalent of firewall rules: created for specific purposes, rarely reviewed, and gradually accumulating into an accidental architecture that nobody fully understands. Every trust relationship should have an associated comment or tag explaining why it exists and when it should be reviewed. The cloud-native IAM practices that prevent trust relationship sprawl start with this documentation at creation time. If you can’t explain why a trust relationship exists, it probably shouldn’t.
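The documentation rule becomes enforceable once it lives in tags. The sketch below assumes two hypothetical tag keys, trust-reason and review-by (an ISO date), and a role inventory already exported as a list of dictionaries; both conventions are illustrative and would need to match whatever your organization adopts.

```python
from datetime import date
from typing import Any, Dict, Iterable, List

REASON_TAG = "trust-reason"  # hypothetical tag key: why the trust exists
REVIEW_TAG = "review-by"     # hypothetical tag key: ISO date of next review


def audit_trust_tags(roles: Iterable[Dict[str, Any]], today: date) -> Dict[str, List[str]]:
    """Return {role name: [problems]} for roles with missing or overdue documentation."""
    findings: Dict[str, List[str]] = {}
    for role in roles:
        tags = role.get("tags", {})
        problems = []
        if not tags.get(REASON_TAG):
            problems.append("missing reason")
        review_by = tags.get(REVIEW_TAG)
        if not review_by:
            problems.append("missing review date")
        else:
            try:
                if date.fromisoformat(review_by) < today:
                    problems.append("review overdue")
            except ValueError:
                problems.append("unparseable review date")
        if problems:
            findings[role["name"]] = problems
    return findings


if __name__ == "__main__":
    # Invented inventory for illustration.
    roles = [
        {"name": "cicd-deploy-target",
         "tags": {REASON_TAG: "CI deploys from shared-services", REVIEW_TAG: "2030-01-01"}},
        {"name": "migration-2022", "tags": {REVIEW_TAG: "2023-01-01"}},
    ]
    for name, problems in audit_trust_tags(roles, date.today()).items():
        print(name, problems)
```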