How to version APIs with API Gateway and Lambda


A common challenge when building APIs is supporting multiple live versions without letting the system turn into an unmaintainable mess.

You need to keep older versions running for existing clients, roll out new versions safely, and avoid breaking changes that might take down production. And you need to do all that without duplicating too much infrastructure or introducing spaghetti logic inside your code.

There’s no official pattern for versioning APIs in API Gateway + Lambda. API Gateway gives you stages and custom domains; Lambda gives you versions and aliases. But it’s up to you to stitch them together into a strategy that works.

In this post, I’ll walk through several practical approaches to API versioning in this setup, explain the trade-offs of each, and help you choose the one that best fits your needs.

Prefer duplicating Lambda functions over versions

A common question is: “Should I duplicate the Lambda function, or use Lambda versions and aliases?”

I always prefer duplicating the function. Here’s why:

1. IAM roles don’t version. If multiple function versions reference the same IAM role, you can easily break the old versions with permission changes for the new version.
2. Tracking and testing separate versions of your code as separate functions is much easier.

With most IaC tools, duplicating the function and its IAM role is straightforward. And thanks to pay-per-use pricing, there’s no penalty for this unless you’re using provisioned concurrency.

Technically, you can point different versions to different IAM roles, but this adds complexity, and most IaC tools don’t handle it well out of the box.

Option 1: URL-based Versioning

A very common approach is to add a version to the URL, e.g. myapi.com/v1 and myapi.com/v2.

You can do this using API Gateway stages or custom domain mappings. But I strongly recommend against using stages for versioning. Like Lambda versions, they add hidden complexity and are awkward to manage in IaC.

Instead, duplicate the API Gateway, Lambda functions and IAM roles and use custom domains or path mappings to route traffic to each version. It’s more explicit and gives you better separation.

Pros

  • It’s clear to the client which version they’re using.
  • It’s easy to deprecate the old versions.

Cons

  • More infrastructure to configure (but pay-per-use pricing keeps cost low unless using provisioned concurrency)

Option 2: Custom HTTP header/query param

Instead of changing the URL, you can version via a custom HTTP header (e.g. X-API-Version: 2) or a query string parameter.

This works well if you want to default everyone to a specific version but give early adopters a way to opt into a newer one.

However, since multiple versions share the same route, API Gateway’s built-in metrics are no longer enough. You’ll need to emit custom metrics (e.g. using EMF) for fine-grained metrics & alerts.

Pros

  • Minimal infrastructure duplication.
  • Clean URLs.
  • Easy to implement opt-in rollouts.

Cons

  • More logic in the code to handle routing.
  • You need to keep old logic around potentially forever.
  • Requires custom metrics for per-version insights.

Option 3: Lambdalith

A Lambdalith is a monolithic application deployed as a single Lambda function, often using the Lambda Web Adapter [1].

It supports both URL and header/query param-based versioning. All routing happens inside your code – no duplicated infrastructure, but more internal complexity.

For a deeper dive into Lambdaliths, check out this earlier article [2]. Here, I’ll just focus on versioning implications.

Pros

  • No duplicated infrastructure.
  • Flexibility – supports different versioning schemes.

Cons

  • More logic in the code to handle routing.
  • You need to keep old logic around potentially forever.
  • Requires custom metrics for per-version insights.

Option 4: No breaking changes!

Honestly, most versioning strategies suck… and I often find myself coming back to a simpler rule: just don’t introduce breaking changes.

It’s the same advice I gave for versioning events in event-driven architectures [3].

No breaking changes, no need for versioning.

Never remove/rename existing fields in your API contract, and never change the data type of existing fields. Instead, always add new fields.

Pros

  • No need for versioning.
  • Clients keep working if they only rely on known fields.

Cons

  • Requires discipline and strong contract testing (e.g. with Pact [4]).
  • Not guaranteed to work forever. Sometimes you are forced to rename/remove fields.

How to choose the right approach

To pick the right strategy, ask yourself:

  • How different are the versions?
  • Do they need a clean separation?
  • How long do you need to support old versions?
  • Do you want a default version, or force clients to choose?
  • Do you want infrastructure separation or a single deployment?

For example, if v2 is a major rewrite and you need to support v1 long-term, separate APIs with custom domains may be worth the extra infra.

But if v2 is mostly compatible, routing by header inside a single Lambda might be simpler.

Conclusion

As much as I prefer avoiding breaking changes, sometimes you just can’t.

So it’s worth thinking ahead about how you’ll handle versioning – even if you hope not to use it.

There’s no one-size-fits-all answer. Use the questions above to weigh your options and pick the strategy that fits your product, team, and constraints.

Links

[1] Lambda Web Adapter

[2] The pros and cons of Lambdaliths

[3] Event versioning strategies for event-driven architectures

[4] Pact, for consumer-driven contract testing

Master Serverless

Join 17K readers and level up you AWS game with just 5 mins a week.

Read more from Master Serverless

Modern applications rarely do just one thing at a time. An API request creates an order, and then another service needs to reserve stock, another to charge the customer, another to send an email, and so on. In a serverless or event-driven architecture, follow-up actions are usually triggered by messages (either events or commands). That gives us loose coupling, better scalability, and independent services. But it also introduces a reliability problem. “What happens when the database update...

If you use Claude Code a lot, you’ve probably run into usage limits, sometimes even in short coding sessions. But cost isn’t the only problem. In long-running sessions, the context window eventually fills up, and that can cause the agent to forget earlier decisions, lose important details, or come back from compaction with gaps in its working memory. Here are three tools worth checking out if you want to reduce token usage and make longer coding sessions possible. 1. CavemanThis is a Claude...

AI agents can now scan an entire open-source codebase for exploitable vulnerabilities in hours. Frontier models carry the complete library of known bug classes in their weights. So you can simply point an AI agent at a codebase and tell it to find zero-days. This isn't theoretical. Willy Tarreau, the HAProxy lead developer, reports that security bug reports have jumped from 2–3 per week to 5–10 per day. Greg Kroah-Hartman, the Linux kernel maintainer, described what happened: "Months ago, we...