Everyone knows Canary Deployments, but do you know the Dark Read pattern?


Software systems are getting bigger and more complex, and we are constantly looking for ways to test code in production without risking the user experience.

Canary deployments are a popular mechanism for rolling out changes incrementally, allowing us to limit the blast radius in case something goes wrong.

However, they’re not without limitations. Canary deployments essentially sacrifice a small portion of users for the greater good. But what if you want to gain insights without impacting any real users?

That's where the dark read pattern comes in.

Canary Deployments: a quick primer

Canary deployments let you release updates to a small subset of users while leaving the majority untouched. The process is simple:

  1. Deploy the new version to a small fraction of your users.
  2. Monitor it like a hawk. If all goes well, gradually ramp up until all users are on the latest version.
  3. If anything looks off, roll back quickly and minimize user impact.
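To make step 1 concrete, here is a minimal sketch of how a request could be assigned to the canary group. In practice this is usually handled by a load balancer or weighted routing rather than application code; the `in_canary` function and the hash-based bucketing are assumptions for illustration only.

```python
import hashlib


def in_canary(user_id: str, canary_percent: float) -> bool:
    """Return True if this user should be routed to the canary version.

    Hypothetical helper: hashes the user ID into one of 10,000 buckets.
    """
    # Hashing keeps the assignment stable, so a user does not flip
    # between the old and new versions from one request to the next.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    # 5% -> buckets 0..499. Raising the percentage only adds users,
    # so everyone already on the canary stays on it during ramp-up.
    return bucket < canary_percent * 100
```

Ramping up is then a matter of raising `canary_percent`, and rolling back means setting it to 0.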

Canary deployments work well for testing new features or configurations in real-world settings. However, they can impact the experience for users in the canary group. If something goes wrong, they will feel it.

The Dark Read pattern

The Dark Read pattern takes a fundamentally different approach.

It involves deploying the new version alongside the old one and executing both in parallel. The user request is served from the existing system, but the request is simultaneously executed against the new system to observe its behaviour and validate its response.

This way, you can see how the new code would perform if it were handling production traffic without impacting user experience.

Think of it as a “shadow test”. The goal is to see:

  • How the new code handles production-scale data.
  • Its performance under real traffic load.
  • Any potential functional deviations from the current system.
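The flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the implementation we ran: `DarkReader`, `primary`, `shadow` and `observe` are hypothetical names, and a real system would more likely mirror traffic at the proxy or queue level than inside a thread pool.

```python
import copy
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional

Handler = Callable[[dict], Any]
# Called with (request, primary_response, shadow_response, shadow_error).
Observer = Callable[[dict, Any, Any, Optional[Exception]], None]


class DarkReader:
    """Serves requests from `primary` and mirrors them to `shadow`."""

    def __init__(self, primary: Handler, shadow: Handler, observe: Observer,
                 max_workers: int = 4) -> None:
        self._primary = primary
        self._shadow = shadow
        self._observe = observe
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def handle(self, request: dict) -> Any:
        # Clone the request and start the shadow call on a worker thread,
        # so it runs in parallel with the existing system.
        mirrored = copy.deepcopy(request)
        shadow_future = self._pool.submit(self._shadow, mirrored)
        # The user only ever receives the existing system's response.
        response = self._primary(request)
        # Reporting also happens in the background, so a slow or broken
        # shadow never delays or fails the user's request.
        self._pool.submit(self._report, request, response, shadow_future)
        return response

    def _report(self, request: dict, primary_response: Any,
                shadow_future: Future) -> None:
        try:
            shadow_response = shadow_future.result()
        except Exception as err:  # shadow failures are recorded, not raised
            self._observe(request, primary_response, None, err)
        else:
            self._observe(request, primary_response, shadow_response, None)

    def close(self) -> None:
        # Wait for in-flight shadow calls, e.g. on shutdown or in tests.
        self._pool.shutdown(wait=True)
```

The `observe` callback is where latency, errors and response differences would be logged for later review.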

The Dark Read pattern in action

At DAZN, my team was responsible for rewriting the "schedule service". It's responsible for deciding what content the user sees on the home screen and is one of the most business-critical services.

Given the business criticality, we opted for the Dark Read pattern.

  • Every user request is captured and cloned to the new service.
  • Users continue to interact with the existing service as we validate the new service in terms of performance and correctness.
  • The response from the new service is compared against the response from the old service in the background. Differences are flagged and reviewed.
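The comparison step might look something like the sketch below. It is not the code we used: `diff_responses` is a hypothetical helper, and the field names in the usage example (`rails`, `generatedAt`) are made up. The one design choice worth noting is the `ignore` set, which assumes some fields, such as timestamps, legitimately differ between the two services and should not be flagged.

```python
from typing import Any, FrozenSet, List


def diff_responses(old: Any, new: Any,
                   ignore: FrozenSet[str] = frozenset(),
                   path: str = "$") -> List[str]:
    """Return a human-readable list of differences between two responses."""
    diffs: List[str] = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new), key=str):
            if key in ignore:
                continue  # e.g. timestamps that are expected to differ
            child = f"{path}.{key}"
            if key not in old:
                diffs.append(f"{child}: only in new response")
            elif key not in new:
                diffs.append(f"{child}: only in old response")
            else:
                diffs.extend(diff_responses(old[key], new[key], ignore, child))
    elif isinstance(old, list) and isinstance(new, list):
        if len(old) != len(new):
            diffs.append(f"{path}: length {len(old)} != {len(new)}")
        # Compare element by element up to the shorter list's length.
        for i, (a, b) in enumerate(zip(old, new)):
            diffs.extend(diff_responses(a, b, ignore, f"{path}[{i}]"))
    elif old != new:
        diffs.append(f"{path}: {old!r} != {new!r}")
    return diffs


# Usage with made-up payloads:
old = {"rails": [{"id": "a"}, {"id": "b"}], "generatedAt": "t1"}
new = {"rails": [{"id": "a"}, {"id": "c"}], "generatedAt": "t2"}
print(diff_responses(old, new, ignore=frozenset({"generatedAt"})))
# prints ["$.rails[1].id: 'b' != 'c'"]
```

An empty list means the two responses matched; anything else can be logged alongside the request for review.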

We ran this for several weeks and were able to identify edge cases and fix bugs without impacting any users.

This pattern is very effective for backend services where the focus is on response accuracy, latency, and handling load rather than UI or frontend logic. You get the perks of testing in production without the direct risk to user experience.

Why Use Dark Read Over Canary?

1. No User Impact
With canaries, if the new version misbehaves, some users feel it. Dark reads let you run the new code against production data without impacting real users at all. This is crucial for sensitive applications where even a small failure might cost revenue or reputation.

2. Ideal for Load Testing in Production
Canary deployments are limited to partial traffic by design, making them less ideal for testing how the new version scales at full production load. With dark reads, you can hit production levels of load against the new code without ever showing its output to end users.

3. More Extensive Validation
Dark reads let you observe the new code’s correctness and performance across diverse real-world scenarios, including edge cases, without risking a regression that affects users.

4. Continuous Monitoring without Worrying about Rollback
With a canary, rollbacks can be taxing on the team and time-consuming, especially if the deployment is already far along. And by the time rollbacks happen, some users have already been affected. Dark reads run in parallel, so you can constantly monitor, adjust, and fix issues as they appear, reducing the need for urgent rollbacks.

Drawbacks

1. Increased Complexity

Running two versions of code in parallel adds architectural complexity and requires infrastructure for mirroring traffic, logging, and comparing results.

2. Applicable Only to Certain Types of Tests

Dark reads are great for validating logic and load handling but won’t help in testing UX, frontend changes, or how users interact with a new feature.

3. Additional Costs

Duplicating traffic and processing it twice leads to increased costs, especially under high traffic.

Conclusion

While the dark read pattern doesn’t replace canary deployments, it’s a useful tool to have in your arsenal.

Canary deployments provide controlled, real-world testing with a limited blast radius, while dark reads offer shadow testing without risking real-world effects.

For critical backend changes, database migrations, or performance improvements, dark reads enable deeper insights without risking real user impact.
