Lambda Durable Functions comes with a handy testing SDK that makes it easy to test durable executions both locally and remotely in the cloud. I find the local test runner particularly useful for dealing with wait states, because I can simply configure the runner to skip time! However, this does not work for callback operations such as waitForCallback, and unfortunately, the official docs didn't include any examples of how to handle them. So here's my workaround.

The handler code

Imagine you're implementing an order processing workflow for a food delivery service. When an order comes in, you would: 1) save the order details in the database; 2) publish an "order_placed" event to EventBridge; 3) send a notification to the restaurant and wait for them to confirm whether or not they accept the order. This last step is very important, and there are many reasons for a restaurant to reject an order. They might have run out of ingredients for the dish, or maybe they are overbooked and don't have the bandwidth to take the order. Equally, we need to put a timeout on the notification because we can't keep the customer waiting forever.

The notifyRestaurant function sends a push notification to the restaurant's POS device with the relevant information about the order. We want to test the three possible scenarios here: the restaurant accepts the order, the restaurant rejects it, or the callback times out.
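The handler snippet from the original post didn't survive here, so below is a reconstruction of the handler just described. Treat everything in it as illustrative: the import path, the DurableContext methods (step, waitForCallback), the timeout option shape, and the helpers saveOrder, publishOrderPlaced, and notifyRestaurant are my assumptions, not the SDK's confirmed API.

```typescript
// Sketch only: API names below are assumed, not verified against the SDK.
import { DurableContext } from '@aws/durable'; // hypothetical import path

type OrderDecision = { accepted: boolean; reason?: string };

export const handler = async (order: Order, ctx: DurableContext) => {
  // 1) save the order details in the database
  await ctx.step('Save order', () => saveOrder(order));

  // 2) publish an "order_placed" event to EventBridge
  await ctx.step('Publish order_placed event', () => publishOrderPlaced(order));

  // 3) notify the restaurant and wait for accept/reject, with a timeout
  //    so we don't keep the customer waiting forever
  const decision = await ctx.waitForCallback<OrderDecision>(
    'Notify restaurant',
    (token) => notifyRestaurant(order, token), // push to the restaurant's POS
    { timeout: '10 minutes' } // hypothetical option shape
  );

  if (decision.accepted) {
    // continue the happy path, e.g. confirm the order with the customer
  } else {
    // handle the rejection, e.g. refund the customer
  }
};
```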
The test code

The first challenge with testing callbacks is that if you follow the online examples and await the execution result, the execution will hang as it waits for the callback. Unfortunately, setting "skipTime" to true doesn't help here, because it only applies to wait operations, not callbacks. Instead, you must start the execution and let it run in the background without awaiting it.

The next challenge is that if you try to fetch the "Notify restaurant" callback operation right away, it might not be available yet, because the preceding operations have not completed. In that case, fetching the "Notify restaurant" operation will throw an error. If that happens, you may have to wait a bit and let the execution reach the "Notify restaurant" step first.

So that covers the test cases for the restaurant accepting and rejecting the order. But what about the timeout path? We wouldn't want to wait 10 mins for the callback to time out naturally. In this specific case, there's a shortcut.
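Before we get to that, here is the accept/reject flow sketched end to end, since the original code snippets are missing from this copy. The overall flow — start without awaiting, poll for the operation, then complete the callback — is what the post describes; but the runner method names (startExecution, getOperation, sendCallbackSuccess) and the { accepted } payload shape are my assumptions and may differ from the real SDK.

```typescript
// Sketch only: runner API names are assumed, not verified.
import { LocalDurableTestRunner } from '@aws/durable/testing'; // hypothetical

it('completes the order when the restaurant accepts', async () => {
  const runner = new LocalDurableTestRunner(handler);

  // Start the execution WITHOUT awaiting it, otherwise the test
  // hangs while the execution waits for the callback.
  const execution = runner.startExecution({ input: order });

  // The "Notify restaurant" operation might not exist yet if the
  // preceding steps haven't completed, so retry until it shows up.
  let operation;
  for (let attempt = 0; attempt < 10; attempt++) {
    try {
      operation = await runner.getOperation('Notify restaurant');
      break;
    } catch {
      await new Promise((res) => setTimeout(res, 100)); // wait and retry
    }
  }

  // Complete the callback as if the restaurant accepted the order.
  await operation.sendCallbackSuccess({ accepted: true });
  // (for the rejection test, send { accepted: false } instead)

  const result = await execution; // now it's safe to await
  expect(result.status).toEqual('ORDER_CONFIRMED'); // hypothetical assertion
});
```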
So to trigger the timeout path, we can send a failure response to the callback and drive the execution towards the timeout path. This feels like a hack, but it's the best I have come up with. Please let me know if you know of a better way to do this!

What can be improved?

AWS has done a great job with Lambda Durable Functions, and the testing SDK is generally easy to use. But as you have seen, testing callbacks is still tricky. I have shown you my workaround, but it's not a satisfying solution. It'd be much better if I could provide a delegate function to the LocalDurableTestRunner to supply the callback response when requested. Again, if you can think of something better, please let me know!
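The snippet illustrating that wished-for API is also missing from this copy, but it could look something like the sketch below. This is entirely hypothetical — the option name and response shapes are invented, and the SDK offers nothing like it today:

```typescript
// Wishlist, not a real API: let the runner answer callbacks on demand
// via a user-supplied delegate, instead of polling for operations.
const runner = new LocalDurableTestRunner(handler, {
  callbackDelegate: (operationName) => {
    if (operationName === 'Notify restaurant') {
      return { success: { accepted: true } };
      // or { failure: someError } to test the failure path,
      // or { timeout: true } to exercise the timeout path directly
    }
  },
});
```

A delegate like this would remove both the polling loop and the fake-failure hack from the tests.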