Multi-region Cognito User Pools: can we do it?


Cognito doesn't support multi-region user pools (yet), but maybe we can get there ourselves 🤔

Let's do a thought experiment together.

Proposed solution

An API Gateway authorizer has a "ProviderARNs" attribute, where you can reference more than one Cognito user pool.

These are ARNs, so one assumes it can reference a user pool in another region.
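As a rough sketch of what that could look like, here is how the authorizer parameters might be built in Python for API Gateway's CreateAuthorizer call. The account ID, pool IDs, REST API ID and authorizer name are all made up for illustration, and whether API Gateway actually accepts a user pool ARN from another region is still the assumption we are testing.

```python
from typing import Dict, List


def user_pool_arn(region: str, account_id: str, user_pool_id: str) -> str:
    """Build the ARN of a Cognito user pool."""
    return f"arn:aws:cognito-idp:{region}:{account_id}:userpool/{user_pool_id}"


def authorizer_params(rest_api_id: str, pool_arns: List[str]) -> Dict[str, object]:
    """Keyword arguments for the API Gateway CreateAuthorizer call."""
    return {
        "restApiId": rest_api_id,
        "name": "multi-region-cognito",  # hypothetical authorizer name
        "type": "COGNITO_USER_POOLS",
        "providerARNs": pool_arns,  # one entry per user pool
        "identitySource": "method.request.header.Authorization",
    }


# Hypothetical account and pool IDs, one pool per region.
arns = [
    user_pool_arn("us-east-1", "123456789012", "us-east-1_EXAMPLE1"),
    user_pool_arn("eu-west-1", "123456789012", "eu-west-1_EXAMPLE2"),
]
params = authorizer_params("abc123", arns)

# With boto3 installed and credentials configured, this would be passed as:
#   boto3.client("apigateway").create_authorizer(**params)
```

The same authorizer definition would be deployed to the API in each region, so either API accepts tokens from either pool.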

Cognito's "Post confirmation" hook fires after a newly signed-up user confirms their account. It invokes a Lambda function with the user's "userAttributes" and "clientMetadata".

We can use the information in the payload to create the user in the other region.

BUT, we don't have the user's password.

If we DON'T use passwords, that will not be a problem. That is, we can go for passwordless authentication. For example, we can use one-time passwords [1] or magic links [2].

If we don't use passwords, we can set a random password when creating the user in the other region.
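Here is a minimal sketch of what that Post confirmation Lambda might look like in Python. It assumes users sign in with a plain username (if the pool uses email as the username, "Username" would need to be the email instead), that the replica pool uses a password policy the random password satisfies, and that two environment variables, REPLICA_REGION and REPLICA_USER_POOL_ID, exist. Those names are mine, not anything Cognito defines. Note that "sub" is dropped: Cognito generates it per pool, so the same user ends up with a different "sub" in each region.

```python
import os
import secrets
import string

SYMBOLS = "!@#$%^&*"


def random_password(length: int = 32) -> str:
    """Random password with at least one lower, upper, digit and symbol."""
    required = [
        secrets.choice(string.ascii_lowercase),
        secrets.choice(string.ascii_uppercase),
        secrets.choice(string.digits),
        secrets.choice(SYMBOLS),
    ]
    alphabet = string.ascii_letters + string.digits + SYMBOLS
    chars = required + [secrets.choice(alphabet) for _ in range(length - len(required))]
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


def replicate_user(cognito, user_pool_id: str, event: dict) -> None:
    """Create the confirmed user in the other region's user pool."""
    username = event["userName"]
    attributes = [
        {"Name": name, "Value": value}
        for name, value in event["request"]["userAttributes"].items()
        # "sub" and "cognito:*" attributes are managed by Cognito itself.
        if name != "sub" and not name.startswith("cognito:")
    ]
    cognito.admin_create_user(
        UserPoolId=user_pool_id,
        Username=username,
        UserAttributes=attributes,
        MessageAction="SUPPRESS",  # don't send a second welcome message
    )
    # Nobody ever sees this password; a permanent one moves the user
    # out of the FORCE_CHANGE_PASSWORD state.
    cognito.admin_set_user_password(
        UserPoolId=user_pool_id,
        Username=username,
        Password=random_password(),
        Permanent=True,
    )


def handler(event, context):
    import boto3  # imported here so the helpers above can be tested without AWS

    # The trigger also fires after a confirmed password reset; skip that case.
    if event["triggerSource"] == "PostConfirmation_ConfirmSignUp":
        cognito = boto3.client("cognito-idp", region_name=os.environ["REPLICA_REGION"])
        replicate_user(cognito, os.environ["REPLICA_USER_POOL_ID"], event)
    return event  # Cognito expects the event back
```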

In the front end, the user can log in with either user pool. The resulting JWT can be used to access APIs in either region.
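One way to see which pool a user logged in with is the "iss" claim, which Cognito sets to https://cognito-idp.<region>.amazonaws.com/<userPoolId>. The sketch below only decodes the payload for inspection and does not verify the signature (the API Gateway authorizer does that); the function names and the sample token are made up for illustration.

```python
import base64
import json


def token_issuer(jwt: str) -> str:
    """Return the 'iss' claim of a JWT WITHOUT verifying its signature."""
    payload_b64 = jwt.split(".")[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["iss"]


def issuer_region(issuer: str) -> str:
    """Extract the region from a Cognito issuer URL."""
    host = issuer.split("/")[2]  # cognito-idp.<region>.amazonaws.com
    return host.split(".")[1]


def _b64url(data: dict) -> str:
    """Helper for building an unsigned sample token."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Sample token with a hypothetical pool ID and a dummy signature.
sample_token = ".".join([
    _b64url({"alg": "RS256"}),
    _b64url({"iss": "https://cognito-idp.eu-west-1.amazonaws.com/eu-west-1_EXAMPLE2"}),
    "signature",
])
region = issuer_region(token_issuer(sample_token))
```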

Handling failures

What if the Cognito service is down in one of the regions?

The whole point of going multi-region is to provide higher redundancy, but if Cognito is down in one region, then synchronization is broken.

What do we do then?

We introduce a fallback.

If we can't synchronously add a new user to the other region's user pool, we push a message to an SQS queue.
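A minimal sketch of that fallback, assuming "replicate" is whatever function creates the user in the other region and "sqs" is a boto3 SQS client. The function name, the return values and the message shape are my own choices for illustration.

```python
import json


def replicate_or_enqueue(replicate, sqs, queue_url: str, event: dict) -> str:
    """Try to replicate the user now; on any failure, queue it for later."""
    try:
        replicate(event)
        return "replicated"
    except Exception:
        # Broad catch on purpose: whatever went wrong, sign-up in the
        # home region should still succeed.
        body = {
            "userName": event["userName"],
            "request": {"userAttributes": event["request"]["userAttributes"]},
        }
        sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(body))
        return "queued"
```

The queue should live in the same region as the Lambda function, so the fallback doesn't depend on the region that is having problems.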

A Lambda function is subscribed to the queue and will retry the failed operation N times before moving the message to a DLQ for further investigation and manual retries.
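The retry function could be sketched like this. It assumes the queued message body carries the same "userName" and "userAttributes" fields as the original Cognito event, that "replicate" is the function that creates the user in the other region, and that ReportBatchItemFailures is enabled on the SQS event source mapping. The "N times" part is not in the code at all: it would come from the queue's redrive policy (maxReceiveCount), which moves a message to the DLQ once it has been received that many times.

```python
import json


def process_batch(event: dict, replicate) -> dict:
    """Retry queued replications; report only the messages that failed."""
    failures = []
    for record in event["Records"]:
        try:
            replicate(json.loads(record["body"]))
        except Exception:
            # Failed messages go back to the queue and are retried until
            # the redrive policy sends them to the DLQ.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Since a message can be delivered more than once, "replicate" would ideally treat "user already exists" as success rather than as a failure.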

Summary

This is how the solution will look at a high level.

There are a few constraints:

  • We have to use API Gateway REST APIs
  • We have to use passwordless authentication

Under these conditions, I think it's a workable solution.

Do you think this will work?

Do you see any flaws in this solution?

And would you like to see a POC for this solution?

Links

[1] Passwordless Authentication made easy with Cognito: a step-by-step guide

[2] Implementing Magic Links with Amazon Cognito: A Step-by-Step Guide

