Lambda Durable Functions makes it easy to implement business workflows using plain Lambda functions. Besides the intended use cases, it also lets us implement ETL jobs without needing recursion or Step Functions. Many long-running ETL jobs have time-consuming, sequential steps that cannot be easily parallelised. For example:
Historically, Lambda was not a good fit for these workloads because of its 15-minute maximum execution time. That's no longer a problem with Lambda Durable Functions, because a "durable execution" can run for up to a year.

BUT, each individual invocation is still limited to 15 minutes. So the trick is to use context.wait to slice the long-running durable execution into many invocations, each of which fits comfortably inside the 15-minute limit. This works because every time you call context.wait, the current invocation exits and the function is re-invoked after the specified time has passed.

Consider a list of items that need to be processed in order. Collectively, they might take more than 15 minutes to process. However, by breaking them into more manageable batches, we ensure that each batch can be finished within the 15-minute limit. At the end of each batch, we use context.wait to exit the current invocation. The function is re-invoked after a 1-second delay and continues on to the next batch.

Each batch is given a unique step name, so on replay, the system can fetch the correct result from the previous invocation. This also helps us understand what's going on when we look at the audit history.

This is a nice pattern (no need for risky recursion!) and very easy to implement. However, there are a few things to keep in mind:
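To make the batching pattern described above concrete, here's a minimal sketch. It deliberately does not import the real SDK. `DurableContextLike`, `runDurably` and `processInBatches` are hypothetical names I made up for illustration, and the `step` and `wait` signatures are assumptions that may not match the actual API. `runDurably` is a toy in-memory stand-in for the checkpoint + replay behaviour, so you can see why each item is only processed once even though the handler restarts from the top on every invocation.

```typescript
// Hypothetical, minimal stand-in for the durable context. The real SDK's
// types and signatures may differ; only `step` and `wait` are modelled here.
interface DurableContextLike {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  wait(duration: { seconds: number }): Promise<void>;
}

// The pattern from the post: process items in order, one batch per step,
// and wait after each batch so the next batch runs in a fresh invocation.
async function processInBatches(
  ctx: DurableContextLike,
  items: string[],
  batchSize: number,
  processItem: (item: string) => Promise<string>,
): Promise<string[]> {
  const results: string[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Unique step name per batch, so replay can find the right checkpoint.
    const stepName = `process-batch-${start / batchSize}`;
    const batchResults = await ctx.step(stepName, async () => {
      const out: string[] = [];
      for (const item of batch) {
        out.push(await processItem(item)); // strictly sequential
      }
      return out;
    });
    results.push(...batchResults);
    // Ends the current invocation; the function is re-invoked afterwards.
    await ctx.wait({ seconds: 1 });
  }
  return results;
}

// Sentinel thrown by the toy runtime to end the current "invocation".
const SUSPEND = { reason: "suspended by wait" };

// Toy runtime (an assumption, not how the service is implemented): it
// re-runs the handler from the start until it completes, replaying
// checkpointed steps and already-elapsed waits instead of redoing them.
async function runDurably<T>(
  handler: (ctx: DurableContextLike) => Promise<T>,
): Promise<{ result: T; invocations: number }> {
  const checkpoints = new Map<string, unknown>(); // survives across invocations
  let invocations = 0;
  for (;;) {
    invocations++;
    let op = 0; // position of the next durable operation in this invocation
    const ctx: DurableContextLike = {
      async step<S>(name: string, fn: () => Promise<S>): Promise<S> {
        const key = `${op++}:${name}`;
        if (checkpoints.has(key)) return checkpoints.get(key) as S; // replay
        const value = await fn();
        checkpoints.set(key, value);
        return value;
      },
      async wait(_duration: { seconds: number }): Promise<void> {
        const key = `${op++}:wait`;
        if (checkpoints.has(key)) return; // this wait has already elapsed
        checkpoints.set(key, true);
        throw SUSPEND; // exit now; the loop below "re-invokes" the handler
      },
    };
    try {
      return { result: await handler(ctx), invocations };
    } catch (err) {
      if (err !== SUSPEND) throw err; // real errors still propagate
    }
  }
}
```

In this simulation, running five items with a batch size of 2 takes four invocations (three batches, plus a final replay after the last wait), and `processItem` is called exactly five times because completed batches are read back from their checkpoints. The toy runtime ignores the wait duration and re-invokes immediately. In the real service, the re-invocation happens after the delay has passed.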
I hope you're experimenting with Lambda Durable Functions, and I'd love to hear how you're using it. It's one of the best additions to Lambda in years!