Legacy API Modernization: Wrap Before You Rewrite
You grep the WSDL for the third time, scrolling through 4,000 lines of XML schema definitions. Somewhere in there is the field mapping that explains why the legacy billing API returns a customerType of 7 for enterprise accounts when the documentation says it should be 3. That stale claim is the only documentation there is. The developer who built this endpoint left four years ago. The Confluence page titled “Billing API Spec” was last updated when the API still ran on a different application server. You check the page history. The last editor also left.
This is what legacy API modernization actually looks like. Not a clean architectural diagram with “old” on the left and “new” on the right. It is reverse-engineering systems where the code is the only source of truth, the protocols are outdated, and the business logic is encoded in behaviors nobody intended to be permanent but have been running transactions for years.
The instinct is to rewrite. Start fresh. Build it right this time with REST, OpenAPI specs, proper versioning. That instinct is wrong. Ground-up API rewrites fail at a staggering rate. The new implementation covers the documented behavior. It misses the undocumented behavior that the business actually depends on. Six months in, the team is still discovering edge cases. The legacy system is still running because nothing else handles the 47 integration partners that depend on its exact response format.
There is a better approach, and it starts with a facade.
The Facade Pattern: Modernize the Interface, Not the Implementation
The API facade pattern inserts a translation layer between your consumers and your legacy backend. External callers hit modern REST or GraphQL endpoints. The facade translates those requests into whatever protocol the legacy system speaks. SOAP, XML-RPC, proprietary binary formats. It translates the responses back.
This sounds trivially simple. It is architecturally simple. The complexity is in the details: field mapping, error translation, authentication bridging, and the hundred small behaviors that make an API “compatible.” But here is why the facade wins: that complexity is bounded and incremental. You handle it one endpoint at a time, with production traffic validating each translation. A rewrite forces you to handle all the complexity at once, under deadline pressure, with no safety net.
The critical insight: the facade does not need to understand the legacy system’s business logic. It translates protocols and maps data structures. The legacy system still executes the business rules. This is what makes facades safe and rewrites dangerous. A rewrite must replicate business logic it may not fully understand. The facade just passes it through.
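As a concrete sketch of that pass-through role, here is a minimal facade translation for one endpoint. The GetCustomerByID operation, the namespaces, and the field names are all hypothetical stand-ins for whatever your WSDL actually defines:

```python
import xml.etree.ElementTree as ET

# All namespaces, operation names, and field names below are hypothetical.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
LEGACY_NS = "http://legacy.example.com/billing"

def rest_to_soap_get_customer(customer_id: str) -> str:
    """Translate GET /customers/{id} into a GetCustomerByID SOAP request."""
    return (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
        f'<soap:Body><GetCustomerByID xmlns="{LEGACY_NS}">'
        f"<CustomerID>{customer_id}</CustomerID>"
        f"</GetCustomerByID></soap:Body></soap:Envelope>"
    )

def soap_to_rest_customer(soap_response: str) -> dict:
    """Map the SOAP response into a JSON-friendly dict. The facade
    translates structure only; the values pass through untouched."""
    root = ET.fromstring(soap_response)
    body = root.find(f"{{{SOAP_NS}}}Body")
    result = body.find(f"{{{LEGACY_NS}}}GetCustomerByIDResponse")
    def field(name):
        return result.findtext(f"{{{LEGACY_NS}}}{name}")
    return {
        "id": field("CustomerID"),
        "name": field("CustomerName"),
        "customerType": field("CustomerType"),  # copied through untouched
    }
```

Note that customerType is copied through as-is: the facade maps structure, and whatever business rule produced the value 7 stays the legacy system's problem.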
Before you can build the facade, though, you need to know what the legacy API actually does. Not what the spec says. What it does.
Legacy API Archaeology: Reverse-Engineering the Undocumented
The spec that matters is not the WSDL. It is what the API actually does in production, with real traffic, and the only way to learn that is to watch it.
Start with traffic capture. Deploy a proxy (Envoy, NGINX, or even tcpdump in desperate situations) that logs every request and response to the legacy endpoints for at least a full business cycle: two weeks at minimum, longer if the system runs monthly or quarterly jobs. Weekly batch runs, monthly reconciliation processes, and quarterly reports all hit different API paths and parameters, and missing any of them will bite you later.
Parse that traffic into a behavioral spec. For each endpoint, document: the actual request parameters observed (not just the ones in the WSDL), the response field usage patterns, error codes and their triggers, and the undocumented behaviors (fields that change meaning based on other fields, responses that vary by caller identity, side effects like audit logging or event emission).
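A sketch of that aggregation step, assuming the capture proxy writes one JSON record per request; the record shape with operation, params, and status fields is a hypothetical log format, not any particular proxy's output:

```python
import json
from collections import defaultdict

def build_behavioral_spec(log_lines):
    """Fold captured traffic (one JSON record per line) into a per-operation
    summary: which parameters callers actually send, which 'required' ones
    arrive empty, and which status codes the backend really returns."""
    spec = defaultdict(lambda: {"params_seen": set(),
                                "empty_params": set(),
                                "statuses": set()})
    for line in log_lines:
        rec = json.loads(line)
        entry = spec[rec["operation"]]
        for name, value in rec.get("params", {}).items():
            entry["params_seen"].add(name)
            if value in ("", None):
                entry["empty_params"].add(name)  # documented "required", sent empty
        entry["statuses"].add(rec["status"])
    return dict(spec)
```

The empty_params set is where the skeletons show up first: fields the WSDL insists are mandatory but half the callers omit without consequence.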
The traffic capture will reveal surprises. It always does. Endpoints in the WSDL that nobody calls. Parameters documented as “required” that are sent empty by half the callers and the system handles it fine. Response fields that contain completely different data depending on a header flag that no documentation mentions. Every legacy system has these skeletons. The question is whether you discover them during archaeology or during a production outage.
This archaeology phase typically takes 3-6 weeks for a system with 50-100 endpoints. It feels slow. It is still faster than discovering undocumented behavior in production after you have already deployed the facade: teams that skip this phase routinely spend two to three times as long debugging facade translation issues. The rule, learned the hard way: invest in archaeology or pay triple in production debugging.
With the behavioral catalog in hand, the real translation work begins.
Protocol Translation: SOAP to REST Is Not Just Format Conversion
The most common legacy API migration is SOAP-to-REST. Teams think this is just XML-to-JSON conversion. It is not. SOAP and REST encode fundamentally different assumptions about how APIs work.
SOAP uses operation-based routing. Every request hits the same URL. The operation is embedded in the SOAPAction header or the XML body. REST uses resource-based routing with HTTP verbs. Translating between these models means mapping SOAP operations to REST resources, which is a design decision for every endpoint.
GetCustomerByID maps cleanly to GET /customers/{id}. But ProcessPaymentWithDiscountAndNotification does not map to a single REST resource. That SOAP operation is doing three things: applying a discount, processing a payment, and sending a notification. The facade needs to either expose this as a single POST endpoint (pragmatic but not RESTful) or decompose it into three calls that the facade orchestrates (clean but adds latency and failure modes).
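One way to make those design decisions explicit is a routing table the facade consults. The routes here are hypothetical; note the composite operation deliberately kept as a single POST, the pragmatic option above:

```python
# Hypothetical mapping: each modern route names the legacy SOAP operation
# it delegates to. Composite legacy operations stay 1:1 for now.
ROUTE_MAP = {
    ("GET", "/customers/{id}"): "GetCustomerByID",
    ("GET", "/customers/{id}/invoices"): "ListInvoicesForCustomer",
    # Not RESTful, but correct: the legacy operation is atomic, so the
    # facade exposes it as one POST instead of orchestrating three calls.
    ("POST", "/payments/discounted"): "ProcessPaymentWithDiscountAndNotification",
}

def resolve(method: str, path_template: str) -> str:
    """Look up the legacy operation for a modern route."""
    try:
        return ROUTE_MAP[(method, path_template)]
    except KeyError:
        raise LookupError(f"No legacy mapping for {method} {path_template}")
```

Keeping the table as data rather than scattered conditionals also gives you a free inventory of which legacy operations are still reachable.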
Data type mapping is another hidden complexity that will eat your schedule if you underestimate it. SOAP’s xsd:dateTime might be serialized as 2024-03-15T14:30:00+05:30 in the legacy response. Your REST consumers expect ISO 8601 in UTC. The facade handles that conversion. But edge cases lurk: nullable dates represented as empty strings versus omitted fields, timezone handling when the legacy system stores local time without offset, and date ranges where the legacy API uses inclusive end dates but your REST convention uses exclusive. Every one of these becomes a bug report if you miss it.
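A sketch of one such conversion, handling the empty-string-as-null and missing-offset edge cases. The +05:30 legacy server zone is an assumption for illustration, not a recommendation:

```python
from datetime import datetime, timezone, timedelta

# Assumption: the legacy box stores local time without an offset, in +05:30.
LEGACY_TZ = timezone(timedelta(hours=5, minutes=30))

def legacy_datetime_to_utc_iso(value):
    """Convert a legacy xsd:dateTime string to ISO 8601 UTC, applying
    the edge-case rules discovered during archaeology."""
    if value is None or value == "":
        return None  # nullable date serialized as an empty string
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=LEGACY_TZ)  # offset-naive: assume legacy local time
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
```

Each rule in this function is one line of code and one potential bug report; the archaeology phase is what tells you which rules you need.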
Do not overthink the initial mapping. Start with pragmatic 1:1 operation mapping. A SOAP operation becomes a single REST endpoint with the same semantics. Refactor toward proper REST resource modeling in later iterations, once the facade is stable and you understand the actual usage patterns. Purity can wait. Correctness cannot.
Traffic Splitting: Shadow, Canary, and the Migration Ratchet
Once the facade is serving production traffic against the legacy backend, you can start the actual migration. Traffic splitting is how you do this without waking up at 3 AM.
Shadow traffic runs first. Always. The facade sends every request to both the legacy backend and the new implementation. Only the legacy response goes back to the caller. The new implementation’s response is logged and compared asynchronously. Mismatches surface behavioral differences you missed during the archaeology phase. And there will be mismatches. Expect them.
Shadow traffic comparison needs to account for non-deterministic fields: timestamps, generated IDs, session tokens. Build a comparator that ignores these fields and focuses on business-meaningful data: amounts, statuses, entity relationships. A well-tuned comparator surfaces the genuine behavioral mismatches while keeping false positives rare enough that the team actually reads the reports.
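A minimal comparator along those lines, with the ignored field names as placeholders you would replace with the non-deterministic fields found in your own traffic:

```python
# Placeholder list: fill in from your own shadow traffic.
IGNORED_FIELDS = {"timestamp", "request_id", "session_token"}

def diff_responses(legacy: dict, modern: dict, ignored=IGNORED_FIELDS):
    """Return (field, legacy_value, modern_value) for every business-level
    mismatch, skipping known non-deterministic fields."""
    mismatches = []
    for key in set(legacy) | set(modern):
        if key in ignored:
            continue
        if legacy.get(key) != modern.get(key):
            mismatches.append((key, legacy.get(key), modern.get(key)))
    return sorted(mismatches)
```

In a real deployment this runs asynchronously off the shadow logs, and the mismatch list feeds the dashboard metric that gates canary ramps.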
Canary rollout follows shadow. Route 1% of production traffic to the new implementation and serve its responses. Monitor error rates, latency percentiles, and business metrics (conversion rates, payment success rates) compared to the legacy cohort. If metrics hold steady for 48 hours, ramp to 5%, then 10%, then 25%, then 50%, then 100%.
The migration ratchet works like this: once an endpoint reaches 100% on the new implementation and runs clean for 2 weeks, mark the legacy endpoint as deprecated. Once every endpoint that depends on a legacy service has migrated, decommission the service. This is the strangler fig pattern applied at the API layer, with a tighter ratchet, because each endpoint migrates independently. Slow and boring beats fast and terrifying.
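The ramp schedule and fallback rule can be encoded directly, sketched here with the percentages above; in practice the metrics_clean flag would be derived from error rates, latency percentiles, and business metrics:

```python
# Canary percentages from the ramp schedule above.
RAMP_STEPS = [1, 5, 10, 25, 50, 100]

def next_split(current_pct: int, metrics_clean: bool) -> int:
    """Advance one ratchet step only when metrics held steady; on any
    regression, fall back to 0% (all traffic to legacy) and investigate."""
    if not metrics_clean:
        return 0
    if current_pct >= 100:
        return 100
    return [p for p in RAMP_STEPS if p > current_pct][0]
```

The asymmetry is deliberate: ramps are gradual, but rollback is total. Halfway states during an incident only complicate the investigation.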
The next challenge is authentication. Legacy and modern systems rarely speak the same auth protocol.
Authentication Bridging: SAML to OIDC Without a Flag Day
Legacy systems commonly use SAML for federation, session cookies for web clients, and API keys for service-to-service calls. Modern systems expect OIDC tokens (JWTs). Forcing all consumers to migrate authentication simultaneously is a “flag day.” Do not do flag days. They are almost always impractical when you have dozens of integration partners.
The token exchange service validates legacy credentials and issues short-lived JWTs with equivalent claims and scopes. The facade attaches both the original credential and the new JWT during the transition period. Legacy backends authenticate with the original credential. New services authenticate with the JWT. Neither system needs to know about the other’s authentication mechanism.
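A toy version of that exchange, minting an HS256 JWT with stdlib HMAC rather than a real JWT library; the key directory, claim names, and signing key handling are all assumptions for illustration (a production key would live in a KMS, not a constant):

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-key"  # assumption: real deployments fetch this from a KMS

def b64url(data: bytes) -> str:
    """Base64url without padding, per JWT convention."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def exchange(legacy_api_key: str, key_directory: dict) -> str:
    """Validate a legacy API key and mint a short-lived JWT carrying
    equivalent claims. key_directory maps legacy keys to the subject
    and scopes they were originally provisioned with."""
    if legacy_api_key not in key_directory:
        raise PermissionError("unknown legacy credential")
    ident = key_directory[legacy_api_key]
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = b64url(json.dumps({
        "sub": ident["subject"],
        "scope": ident["scopes"],
        "exp": int(time.time()) + 300,  # short-lived: five minutes
    }).encode())
    sig = hmac.new(SIGNING_KEY, f"{header}.{claims}".encode(),
                   hashlib.sha256).digest()
    return f"{header}.{claims}.{b64url(sig)}"
```

The facade calls this once per request (or caches per credential), attaches the JWT alongside the original credential, and neither backend ever learns the other's auth mechanism.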
This bridge typically runs for 3-6 months. During that window, integration partners migrate to OIDC at their own pace. Partners that are slow to migrate keep working through the bridge. There is no deadline pressure that forces a risky coordinated cutover. The bridge buys you time, and time is what makes migrations safe.
Authentication handled, the next concern is keeping the legacy backend alive under traffic it was never designed for.
Rate Limiting: Protecting Backends That Cannot Scale
Legacy backends were designed for predictable traffic patterns. The batch job runs nightly. The web application serves a known user base. Traffic is steady and foreseeable. Now expose that backend through a modern API consumed by mobile apps, partner integrations, and SPAs that retry aggressively on failure. You will quickly exceed what the legacy system can handle. Legacy backends that ran fine for a decade fall over within hours of being exposed through a modern facade without rate limiting.
The facade must implement rate limiting that protects the legacy backend while providing a reasonable experience to modern consumers. This is more nuanced than a simple requests-per-second cap.
Adaptive concurrency limiting adjusts the allowed request rate based on the backend’s actual response times. When the legacy system starts responding slowly (latency above p95 baseline), the facade reduces the concurrency window. This catches degradation before it becomes an outage. Netflix’s concurrency-limits library implements adaptive algorithms (Vegas, Gradient) that track backend latency and adjust automatically.
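The core loop is small. This sketch is a crude additive-increase, multiplicative-decrease simplification for illustration, not the Vegas or Gradient algorithms from concurrency-limits themselves:

```python
class AdaptiveLimiter:
    """AIMD-style concurrency limiter: grow the window slowly while
    latency stays at or under the p95 baseline, halve it on breach."""

    def __init__(self, baseline_p95_ms: float, min_limit: int = 1,
                 max_limit: int = 100, start_limit: int = 10):
        self.baseline = baseline_p95_ms
        self.min_limit, self.max_limit = min_limit, max_limit
        self.limit = start_limit

    def on_sample(self, latency_ms: float) -> int:
        """Feed one observed backend latency; returns the new window."""
        if latency_ms > self.baseline:
            self.limit = max(self.min_limit, self.limit // 2)  # back off fast
        else:
            self.limit = min(self.max_limit, self.limit + 1)   # probe slowly
        return self.limit
```

The asymmetry matters for a backend that cannot scale: recovery is cheap to delay, but an overload on a fragile legacy system can cascade in seconds.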
Response caching for idempotent GET endpoints can reduce backend load by 80-90%. Cache invalidation is the hard part. Legacy systems rarely emit change events. A reasonable starting point: time-based TTL for reference data (customer profiles, product catalogs) with a cache-bust header that internal systems can use for immediate invalidation.
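A minimal TTL cache with that explicit bust hook, clock injected so the expiry logic is testable; key naming and TTL values are up to your reference-data semantics:

```python
import time

class TTLCache:
    """Time-based cache for idempotent GETs, with an explicit bust()
    hook backing the cache-bust header internal systems can send."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl, self.clock = ttl_seconds, clock
        self.store = {}  # key -> (stored_at, value)

    def get(self, key, fetch):
        entry = self.store.get(key)
        if entry and self.clock() - entry[0] < self.ttl:
            return entry[1]                  # fresh: backend untouched
        value = fetch()                      # miss or stale: hit the backend
        self.store[key] = (self.clock(), value)
        return value

    def bust(self, key):
        self.store.pop(key, None)            # immediate invalidation
```

For reference data this is often enough; anything with stricter freshness requirements needs either a shorter TTL or a write path that calls bust() itself.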
Priority queuing for writes ensures critical operations (payment processing, order creation) get through even when the legacy system is at capacity. Lower-priority operations (report generation, analytics queries) wait in the queue. This requires classifying API operations by business priority: a conversation with the product team, not just an engineering decision.
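A sketch of the admission queue, with a hypothetical priority table standing in for whatever classification that product conversation produces:

```python
import heapq
import itertools

# Hypothetical classification; lower number = more critical.
PRIORITY = {"payment": 0, "order": 0, "report": 2, "analytics": 2}

class PriorityGate:
    """Admit queued operations to the legacy backend most-critical
    first; ties drain in arrival order."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tiebreaker preserving FIFO order

    def submit(self, op_type: str, payload: dict):
        prio = PRIORITY.get(op_type, 1)  # unclassified ops: medium priority
        heapq.heappush(self._heap, (prio, next(self._seq), op_type, payload))

    def next_op(self):
        _, _, op_type, payload = heapq.heappop(self._heap)
        return op_type, payload
```

A worker pool sized by the adaptive concurrency limit would drain this queue, so the two mechanisms compose: the limiter decides how many requests the backend gets, the gate decides which ones.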
These concerns belong in the facade layer, in dedicated sidecar proxies or gateway middleware, rather than embedded in application code. When rate limiting logic is in the facade, changing a threshold is a configuration update. When it is in application code, it is a deployment.
Monitoring Dual-Stack Operations
Rate limiting protects the backend. Monitoring tells you whether the whole dual-stack operation is actually working.
Running legacy and modern implementations simultaneously creates a monitoring challenge most teams underestimate. You need to track: facade translation accuracy (are responses equivalent?), traffic split ratios and their business impact, latency overhead added by the facade, legacy backend health under changing traffic patterns, and authentication bridge success rates.
The most important metric during migration is the shadow comparison mismatch rate. A sustained mismatch rate above 1% means the new implementation has behavioral differences from the legacy system. These need investigation before ramping canary traffic.
Build a migration dashboard that shows, per endpoint: current traffic split percentage, p50/p95/p99 latency for both legacy and modern paths, error rate comparison between paths, and shadow mismatch count (if still in shadow phase). This dashboard is how you make ratchet decisions. If all metrics look equivalent at 25% canary, ramp to 50%. If mismatch rate spikes, halt and investigate.
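Those ratchet rules can be written down so the decision is mechanical rather than a judgment call under pressure. The mismatch threshold restates the 1% figure above; the error-rate tolerance is an assumed example value:

```python
def ratchet_decision(metrics: dict) -> str:
    """Turn per-endpoint dashboard metrics into a ratchet action.
    Thresholds here are examples, not universal constants."""
    if metrics["shadow_mismatch_rate"] > 0.01:
        return "halt"  # sustained mismatches: investigate before ramping
    if metrics["modern_error_rate"] > metrics["legacy_error_rate"] * 1.1:
        return "halt"  # modern path measurably worse than legacy
    return "ramp"
```

Whether "halt" also triggers an automatic rollback to 0% or just freezes the current split is a policy choice; the point is that the criteria are written down before the incident, not during it.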
Run one observability stack for both sides rather than separate tooling for legacy and modern systems. A unified layer that can correlate a facade request with both its legacy and its modern backend trace reduces investigation time from hours to minutes during migration incidents.
The facade pattern is not glamorous. It does not produce the satisfying feeling of deleting legacy code and replacing it with clean modern implementations. Nobody writes blog posts about their facade deployment. But it works. It keeps the legacy system running while you incrementally prove that the new implementation handles every edge case, every undocumented behavior, and every integration partner’s quirks. When you finally turn off the legacy backend, it is because you have already been running without it for weeks and nobody noticed. That is not anticlimactic. That is the goal.