Data Encryption: Keys, Rotation, and Field-Level Protection

Nov 10, 2025 Metasphere Engineering 13 min read

Full disk encryption. TLS 1.3 everywhere. Auditor signed off last quarter.

Then a SQL injection in a search endpoint hands an attacker authenticated database access through the application tier. They pull hundreds of thousands of records. Perfectly decrypted. Because disk encryption protects against someone stealing the physical drive, not against someone who walks in through the application’s front door. Every compliance checkbox ticked. None of them relevant to the actual attack vector.

The building has a strong front door. The attacker came in through the lobby. The safety deposit boxes were wide open.

Key takeaways

Disk encryption stops stolen drives. It doesn’t stop SQL injection. Application-layer breaches read decrypted data through legitimate access paths.
The algorithm is never the problem. AES-256-GCM is the standard. Key management is where every system actually breaks.
Envelope encryption limits blast radius. One compromised data key exposes one dataset, not everything encrypted under a single master key.
Key rotation must be automated from day one. Manual rotation schedules slip. Automated rotation with 90-day expiry runs whether anyone remembers or not.
Column-level encryption protects sensitive fields even from DBAs. Encrypt PII at the application layer before it reaches the database.

The algorithm, chosen from the NIST cryptographic standards, is never the weak link. Key management is. Where keys are stored, how long they live, who has access, what the blast radius looks like when one leaks. These are engineering questions, not policy questions, and the answers determine whether encryption actually protects anything.

The Key Hierarchy

A chain of trust with three links. Get any one wrong and the others don’t matter.

The OWASP Cryptographic Failures category (A02:2021), which covers these key management failures, ranks #2 in the OWASP Top 10. The root key, the Customer Master Key (CMK), sits in an HSM or cloud KMS. It never touches data directly. Its only purpose: encrypting the keys that encrypt the data.

DEKs (Data Encryption Keys) sit one level down. Generated per-object, per-table, or per-tenant depending on your isolation needs. DEKs encrypt the actual data. The CMK encrypts the DEKs. The encrypted DEK gets stored right alongside the ciphertext. A compromised DEK exposes one dataset. A compromised CMK exposes DEKs, which you can re-wrap right away. Without envelope encryption, a single leaked key exposes everything.

The real payoff: key rotation without re-encrypting the data. Rotate the CMK and you re-wrap the DEKs (tiny key material). The actual data never moves. A 10TB database rotates in seconds, re-wrapping a few thousand DEKs, instead of days of I/O-intensive bulk re-encryption. Without envelope encryption, teams simply never rotate. They know they should. They can’t afford the downtime. So the keys sit unchanged for years.

# Envelope encryption: encrypt data with DEK, wrap DEK with KMS
import base64

import boto3
from cryptography.fernet import Fernet

kms = boto3.client('kms')

# Generate a DEK via KMS - plaintext for use, ciphertext for storage
key_response = kms.generate_data_key(KeyId='alias/app-cmk', KeySpec='AES_256')
dek_plaintext = key_response['Plaintext']
dek_encrypted = key_response['CiphertextBlob']  # Store this alongside data

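# Encrypt the sensitive field. Fernet expects a url-safe base64 key and uses
# AES-128-CBC + HMAC-SHA256 internally; swap in AESGCM if you need AES-256-GCM.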
cipher = Fernet(base64.urlsafe_b64encode(dek_plaintext))
encrypted_ssn = cipher.encrypt(b"123-45-6789")

# Store: encrypted_ssn + dek_encrypted (never store dek_plaintext)
# To decrypt: KMS unwraps dek_encrypted → use plaintext DEK → decrypt field
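
The rotation payoff described earlier (re-wrap the DEKs, leave the data alone) can be sketched as a small loop. This is a minimal sketch under stated assumptions, not a reference implementation: `Record`, `rewrap_deks`, and the `unwrap`/`wrap` callables are hypothetical names, and the callables stand in for whatever your KMS exposes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Record:
    ciphertext: bytes   # field data encrypted under the DEK; rotation never rewrites it
    wrapped_dek: bytes  # the DEK, encrypted under one CMK version
    cmk_version: int    # which CMK version wrapped the DEK


def rewrap_deks(
    records: Iterable[Record],
    unwrap: Callable[[bytes, int], bytes],  # hypothetical: (wrapped_dek, version) -> DEK
    wrap: Callable[[bytes, int], bytes],    # hypothetical: (DEK, version) -> wrapped_dek
    new_version: int,
) -> int:
    """Re-wrap every DEK not yet under new_version; return how many were re-wrapped."""
    rewrapped = 0
    for record in records:
        if record.cmk_version == new_version:
            continue  # already rotated, nothing to do
        dek = unwrap(record.wrapped_dek, record.cmk_version)
        record.wrapped_dek = wrap(dek, new_version)
        record.cmk_version = new_version
        rewrapped += 1
    return rewrapped
```

The work scales with the number of DEKs, not the size of the data. With AWS KMS, the unwrap-then-wrap pair can typically be replaced by the `ReEncrypt` operation so the plaintext DEK never leaves the service.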

Perfect key management protects stored data. But the threat model doesn’t stop at the storage layer.

Field-Level Encryption for Sensitive Data

The Encryption Layer Gap: the distance between where data is encrypted and where it’s attacked. Disk encryption protects the physical layer. Application-layer attacks operate above it. Field-level encryption closes the gap by encrypting data before it ever reaches the database.

Encryption Layer	Protects Against	Does NOT Protect Against	Use For
Disk/volume (EBS, gp3)	Physical theft, decommissioned drives	SQL injection, compromised credentials, replicas	Baseline. All storage.
TLS in transit	Network sniffing, MITM	Authenticated attackers, application bugs	All connections
Field-level (app layer)	DB dumps, SQL injection, replica exposure	Key compromise, application memory inspection	PII: SSN, payment, health
Envelope (KMS + DEK)	Key exposure (only DEK exposed, rotatable)	KMS compromise, IAM misconfiguration	All field-level encryption
Confidential computing (TEE)	Cloud provider access, memory inspection	Side-channel attacks, app-level bugs	Multi-party computation, regulated AI

For regulated PII, specifically Social Security numbers, payment card data, and biometric identifiers, the application encrypts the value before it touches the database. The database stores ciphertext. A full dump without the application’s keys is useless. Each safety deposit box has its own lock. Break into the building, you still can’t open them. Exactly the point.

The trade-offs are real though. Encrypted fields can’t be queried with normal SQL. No sorting. No range queries. No aggregations on ciphertext. You can’t search the contents of a locked box without opening it. Deterministic encryption allows equality lookups (“find by SSN”) but enables correlation attacks. Order-preserving encryption supports ranges but at a security cost too high for sensitive data. Most implementations encrypt a targeted subset of fields based on regulatory classification, not every column. The data privacy by design guide covers which fields actually warrant field-level protection.
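
One way to see the equality-lookup trade-off is a keyed blind index. This is a related technique swapped in for illustration, not necessarily what is meant above by deterministic encryption: the field itself stays under randomized encryption, and an HMAC-SHA256 of the normalized value goes in a separate column. `blind_index` and `index_key` are hypothetical names, and the sketch assumes the index key is held apart from the DEKs.

```python
import hashlib
import hmac


def blind_index(index_key: bytes, value: str) -> str:
    """Keyed, deterministic fingerprint of a field value for equality lookups."""
    normalized = value.replace("-", "").strip()  # assumption: SSN-style normalization
    return hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


index_key = b"\x01" * 32  # placeholder only; a real key would come from the KMS

# Same value in, same index out: this is what makes "find by SSN" possible...
a = blind_index(index_key, "123-45-6789")
b = blind_index(index_key, "123456789")
# ...and it is also what lets anyone holding the column see which rows share a value.
c = blind_index(index_key, "987-65-4321")
```

A lookup then compares the stored index column against `blind_index(index_key, query_value)`. The determinism that enables the match is the same property that enables correlation.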

Data Category	Examples	Encryption Approach	Queryable?	Compliance Driver
Regulated PII	SSN, Tax ID, biometric identifiers	Field-level AES-256-GCM, encrypted in application tier before DB write	No (ciphertext only)	SOC 2, GDPR, HIPAA
Payment Data	Card numbers, bank accounts	Field-level AES-256-GCM, tokenization for recurring use	Via token lookup	PCI DSS
Protected Health	Diagnosis codes, health records	Field-level AES-256-GCM, access-logged decryption	No (ciphertext only)	HIPAA, HITECH
Operational PII	Email, phone, user profiles	Storage-layer encryption (transparent), TLS in transit	Yes (transparent)	GDPR, CCPA
Non-sensitive	Preferences, settings, public content	Storage-layer encryption (transparent)	Yes (transparent)	Best practice

Anti-pattern

Don’t: Encrypt every column at the application layer. Full-database field-level encryption introduces query limits, performance overhead, and operational complexity that most columns don’t warrant. Putting every item in a safety deposit box when most of them are magazines.

Do: Encrypt the 10 most sensitive columns (SSNs, payment card numbers, health identifiers, authentication secrets) at the application layer. Everything else gets disk encryption and access controls. The valuables go in the vault. The staplers stay on the desk.

Rotation Without Downtime

Key rotation is where encryption goes to die.

Systems that treat keys as permanent infrastructure either never rotate or cause outages when they try. Both outcomes are bad. One is just quieter about it. (Not quieter forever. Just quieter until the audit.)

Prerequisites

Application supports versioned key identifiers in the encrypted payload header
KMS/HSM supports generating new key versions without revoking old ones
Data records include a key version indicator alongside the encrypted DEK
Monitoring alerts when old-version decryption requests exceed a threshold after migration window
Rollback procedure tested: application can revert to previous key version within minutes

Dual-key support solves the transition problem. The application keeps a versioned key list, tries the current key first, falls back to the previous version on failure, and lazily re-encrypts on successful old-key reads. No data goes dark during rotation. The old key phases out naturally as records get re-encrypted through normal operations. The transition is invisible to users.
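
The versioned read path can be sketched as follows. This is a minimal sketch, assuming the payload carries a `v<N>:` header as in the prerequisites, so the reader picks the key by version instead of by trial decryption. `seal`, `open_and_upgrade`, and the `encrypt`/`decrypt` callables are hypothetical names; the callables stand in for a real AEAD cipher.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cipher signature: (key, data) -> bytes.
Cipher = Callable[[bytes, bytes], bytes]


def seal(version: int, keys: Dict[int, bytes], encrypt: Cipher, plaintext: bytes) -> bytes:
    """Encrypt and prepend the key version so readers know which key to use."""
    return b"v%d:" % version + encrypt(keys[version], plaintext)


def open_and_upgrade(
    payload: bytes,
    keys: Dict[int, bytes],
    current: int,
    encrypt: Cipher,
    decrypt: Cipher,
) -> Tuple[bytes, Optional[bytes]]:
    """Return (plaintext, upgraded_payload); upgraded_payload is None if already current."""
    header, _, body = payload.partition(b":")  # split on the first colon only
    version = int(header[1:])                  # b"v2" -> 2
    plaintext = decrypt(keys[version], body)   # KeyError here means a key was retired too early
    if version == current:
        return plaintext, None
    # Lazy re-encryption: the caller writes this back after a successful old-key read.
    return plaintext, seal(current, keys, encrypt, plaintext)
```

Returning the upgraded payload instead of writing it keeps the storage layer out of the sketch; the caller decides whether to persist it inline or queue the write.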

Lazy vs. eager re-encryption: when to use each

Lazy re-encryption (on read) works well for datasets with high read rates. Records get re-encrypted naturally as the application accesses them. Downsides: cold records may never be re-encrypted, and you can’t retire the old key until every record has been read. Works best when most records are accessed within the rotation window.

Eager re-encryption (batch migration) processes records actively. Run a background job that reads, decrypts with the old key, re-encrypts with the new key, and writes back. Necessary for compliance scenarios with hard key retirement deadlines. The batch job needs rate limiting to avoid overwhelming the database with write amplification.

Hybrid approach combines both: lazy re-encryption for active data plus a batch sweep for stragglers after 80% of the rotation window. This catches cold records without running a full migration from the start.
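
The straggler sweep might look like the sketch below, assuming a hypothetical `reencrypt` callable that re-encrypts one record by id. The batch size and pause are illustrative defaults, not recommendations, and `sweep_due` encodes the 80% threshold from the text.

```python
import time
from typing import Callable, Sequence


def sweep_stragglers(
    record_ids: Sequence[int],
    reencrypt: Callable[[int], None],  # hypothetical: re-encrypts one record by id
    batch_size: int = 500,
    pause_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-encrypt records in fixed-size batches, pausing between batches."""
    done = 0
    for start in range(0, len(record_ids), batch_size):
        for record_id in record_ids[start:start + batch_size]:
            reencrypt(record_id)
            done += 1
        if start + batch_size < len(record_ids):
            sleep(pause_seconds)  # crude rate limit to cap write amplification
    return done


def sweep_due(elapsed_days: float, window_days: float, threshold: float = 0.8) -> bool:
    """True once the rotation window has passed the sweep threshold."""
    return elapsed_days >= threshold * window_days
```

Passing `sleep` as a parameter keeps the pacing testable; a production job would more likely pace itself on observed database load than on a fixed pause.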

Performance and Architecture Trade-offs

AES-256-GCM with hardware acceleration (AES-NI) is fast. Sub-microsecond per operation. You won’t notice it in isolation. KMS API calls are a completely different story.

Every decryption that calls KMS to unwrap a DEK adds 5-15ms of network latency. A page decrypting 20 fields? That’s 100-300ms if you call KMS for each one. Your users will notice. Your product team will find you.
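
The arithmetic is simple enough to check directly. The sketch below uses the 5-15ms per-call range quoted above; `added_latency_ms` is a hypothetical helper, and the three-DEK case assumes the 20 fields share three distinct DEKs within one request.

```python
KMS_LATENCY_MS = (5, 15)  # per-call range quoted in the text


def added_latency_ms(kms_calls: int) -> tuple:
    """Best-case and worst-case latency added by sequential KMS calls."""
    low, high = KMS_LATENCY_MS
    return (kms_calls * low, kms_calls * high)


per_field = added_latency_ms(20)  # one KMS call per encrypted field
per_dek = added_latency_ms(3)     # assumption: 20 fields share 3 distinct DEKs
print(per_field, per_dek)         # (100, 300) (15, 45)
```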

Caching Strategy	Latency (20 fields)	Security Posture	Best For
No cache (KMS call per decrypt)	100-300ms	Maximum (no keys in memory)	Low-volume, high-sensitivity
Request-scoped cache	15-45ms	High (keys cleared per request)	Most production apps
TTL cache (5 min LRU)	Sub-millisecond (warm)	Moderate (keys in memory briefly)	High-throughput, latency-sensitive
Never cache to disk	N/A	N/A	Absolute rule. No exceptions.

Cache decrypted DEKs in application memory with a short TTL. Yes, this is a deliberate security trade-off. Plaintext DEKs in memory are theoretically accessible via memory dump. Five minutes with LRU eviction is reasonable for most production applications. Single-request caching with immediate clearing is the most conservative approach.
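
A TTL cache with LRU eviction can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a hardened implementation: `DekCache` and its `unwrap` callable are hypothetical names, `unwrap` stands in for the KMS decrypt call, and nothing here zeroes key material on eviction.

```python
import time
from collections import OrderedDict
from typing import Callable


class DekCache:
    """In-memory LRU cache of plaintext DEKs with a per-entry TTL."""

    def __init__(
        self,
        unwrap: Callable[[bytes], bytes],  # hypothetical: asks KMS to decrypt a wrapped DEK
        ttl_seconds: float = 300.0,        # the five-minute window from the text
        max_entries: int = 1000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._unwrap = unwrap
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # wrapped_dek -> (plaintext_dek, expires_at)

    def get(self, wrapped_dek: bytes) -> bytes:
        now = self._clock()
        hit = self._entries.get(wrapped_dek)
        if hit is not None and hit[1] > now:
            self._entries.move_to_end(wrapped_dek)  # mark as most recently used
            return hit[0]
        dek = self._unwrap(wrapped_dek)             # miss or expired: one KMS round trip
        self._entries[wrapped_dek] = (dek, now + self._ttl)
        self._entries.move_to_end(wrapped_dek)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)       # evict the least recently used entry
        return dek
```

Keying the cache on the wrapped DEK means no separate identifier scheme is needed. The cache lives only in process memory, which is consistent with the rule that DEKs never touch disk.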

DEK Caching Strategy	KMS Calls	Latency (20 encrypted fields)	Security Exposure	Recommendation
No cache	Every decrypt calls KMS	+100-300ms (5-15ms per call x 20)	Minimal. Keys never in memory	Development only. Too slow for production
Request-scoped cache	1 call per unique DEK per request	+15-45ms (3 DEKs typical)	Request lifetime only	Good default for most services
TTL cache (5 min)	KMS call only on miss or expiry	Sub-millisecond (warm cache)	5-minute window if memory compromised	Best performance. Use for high-throughput paths
Disk cache	Never	Zero	Permanent exposure if disk accessed	Never. DEKs must not touch persistent storage

One thing teams consistently miss: encrypted fields need special handling in analytics pipelines, data lakes, and ML training. If your warehouse receives field-level encrypted columns, analytics can’t aggregate without a decryption step. The locked boxes go to the analysis department. They can’t open them. Build encryption into the data architecture from the start. Two years later, the architecture is concrete. For key management across hundreds of services, see secrets management at scale. Application security covers wiring field-level encryption without coupling key management to business logic.

What the Industry Gets Wrong About Data Encryption

“Encrypt at rest and you’re covered.” Disk encryption protects against one threat: physical media theft. An attacker with database credentials, a SQL injection, or access to a database replica reads decrypted data through the application’s own access path. Disk encryption is invisible to them. The vault door is locked. The attacker walked in through the lobby.

“HSMs are required for serious encryption.” For most organizations, cloud KMS with HSM-backed storage gives the same security at a fraction of the operational cost. HSMs are required for PCI DSS Level 1, FIPS 140-2 Level 3, or non-exportable root CA keys. For everything else, KMS is the pragmatic choice. You don’t need a bank vault for the petty cash drawer.

“Encryption makes data unrecoverable if you lose the keys.” With envelope encryption, losing a DEK affects one dataset. Losing the CMK is catastrophic, which is why KMS services replicate root keys across multiple availability zones with automatic failover. The real risk isn’t key loss. It’s key sprawl: hundreds of untracked DEKs across services with no inventory. Not losing the keys. Forgetting which keys go to which boxes.

Our take: Encrypt the 10 most sensitive columns, not everything. SSNs, payment card numbers, health identifiers, authentication secrets. Those get application-layer encryption. Everything else gets disk encryption and access controls. Trying to encrypt every column kills query flexibility and adds operational weight that most data doesn’t warrant. The valuables in the vault. The rest behind a locked door. Good enough and actually maintainable beats perfect and abandoned.

That SQL injection from the opening. Full disk encryption didn’t stop it. Field-level encryption on the PII columns would have. Envelope encryption for fast key rotation. A caching layer that made the performance cost invisible. The building had a strong front door. The safety deposit boxes had their own locks. The attacker got into the lobby but left with ciphertext they couldn’t read. The algorithm was never the problem. The architecture around it was everything.

Frequently Asked Questions

What is envelope encryption and why is it the standard pattern?

Envelope encryption uses two key layers. A Data Encryption Key (DEK) encrypts the data. A Key Encryption Key (KEK), called the CMK elsewhere in this article, encrypts the DEK. The encrypted DEK is stored alongside the ciphertext. To decrypt, unwrap the DEK via the KMS, then decrypt the data with the DEK. This keeps the KMS handling small key material only and lets you rotate keys without re-encrypting terabytes of data.

What is the difference between disk encryption and application-level encryption?

Disk encryption protects against physical disk theft but does nothing if an attacker has authenticated database access. Application-level encryption means an attacker who dumps the entire database gets ciphertext they can’t use without the application’s keys. For SSNs, payment data, and healthcare identifiers, disk encryption alone is not enough.

When do you actually need an HSM for key management?

HSMs are required when compliance mandates hardware-backed key protection (PCI DSS Level 1, FIPS 140-2 Level 3), when you need non-exportable private keys for PKI root CAs, or when the threat model demands proof keys never existed in software memory. For most organizations, cloud KMS with HSM-backed storage (AWS KMS, Google Cloud KMS) gives the same security at a fraction of the operational cost.

How do you rotate encryption keys without service downtime?

Keep dual-key support during rotation. Generate the new key, update the application to try the new key first and fall back to the old one, re-encrypt data gradually, verify completion, then retire the old key. Lazy re-encryption on read works well for read-heavy workloads where most records are accessed within the rotation window. For a 10TB database, CMK rotation takes seconds because you only re-wrap the DEKs, not the data.

What is confidential computing and when does it matter?

Confidential computing protects data while it’s being used via hardware Trusted Execution Environments (TEEs). Traditional encryption covers data at rest and in transit, but data must be decrypted in memory for processing. TEEs are relevant for multi-party computation, regulated data processing where cloud provider trust must be minimized, and AI inference on sensitive data. AWS Nitro Enclaves and Azure Confidential VMs are among the production-ready options.