Lambda Durable Functions: How to implement long-running ETL jobs

Lambda Durable Functions makes it easy to implement business workflows using plain Lambda functions.

Beyond those intended use cases, it also lets us implement ETL jobs without resorting to recursion or Step Functions.

Many long-running ETL jobs have time-consuming, sequential steps that cannot be easily parallelised. For example:

Fetching data from shared databases/APIs with throughput limits.
Processing data that must be handled in order.

Historically, Lambda was not a good fit for these workloads because of its max 15 mins execution time. However, that's no longer a problem with Lambda Durable Functions because a "durable execution" can run for up to a year.

BUT, you're still limited by the 15 mins execution time for individual invocations.

So the trick is to use context.wait to slice the long-running durable execution into many invocations, each of which fits comfortably inside the 15 mins execution time limit.

This works because every time you do context.wait, the current invocation exits and the function is re-invoked after the specified time has passed.

source: https://docs.aws.amazon.com/lambda/latest/dg/durable-functions.html

Take, for example, a job that works through a long list of items.

The items need to be processed in order. Collectively, they might take more than 15 mins to process. However, by breaking them into more manageable batches, we ensure that each batch can be finished within the 15 mins limit.

At the end of each batch, we use context.wait to end the current invocation. The function is re-invoked after a 1-second delay and continues onto the next batch.

Each batch is given a unique step name, so on replay, the system can fetch the correct result from the previous invocation.

This also helps us understand what's going on when we look at the audit history.
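The batching described above could be sketched roughly as follows. This is a minimal sketch, not the SDK's actual types: the DurableContextLike interface is an assumption modelled only on the context.step and context.wait calls named in this post, and toBatches, processItem and runEtl are hypothetical names made up for illustration.

```typescript
// ASSUMPTION: a minimal stand-in for the durable context, modelled on the
// context.step / context.wait calls named in this post. Check the SDK for
// the real types and signatures.
interface DurableContextLike {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  wait(duration: { seconds: number }): Promise<void>;
}

// Split items into fixed-size batches, preserving their order.
export function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// Hypothetical per-item work; replace with the real ETL logic.
async function processItem(item: string): Promise<string> {
  return item.toUpperCase();
}

// Process every batch in order and return the number of items processed.
export async function runEtl(
  context: DurableContextLike,
  items: string[],
  batchSize: number,
): Promise<number> {
  const batches = toBatches(items, batchSize);
  let processed = 0;
  let index = 0;
  for (const batch of batches) {
    // A unique step name per batch lets replay match the stored result.
    processed += await context.step(`process-batch-${index}`, async () => {
      for (const item of batch) {
        await processItem(item);
      }
      return batch.length;
    });
    index++;
    // End the current invocation between batches. The wait after the final
    // batch is skipped here because there is nothing left to resume.
    if (index < batches.length) {
      await context.wait({ seconds: 1 });
    }
  }
  return processed;
}
```

The batch size is the main knob: it should be small enough that one batch finishes well inside the 15 mins limit.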

This is a nice pattern (no need for risky recursions!) and very easy to implement.

However, there are a few things to keep in mind:

Depending on the size of your task, it might consume a lot of billable operations for steps and waits. Durable functions charge $8 per million durable operations such as context.step and context.wait.

source: https://aws.amazon.com/lambda/pricing
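To get a feel for the numbers, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions: one context.step per batch, one context.wait every batchesPerWait batches, and only the $8 per million operations price quoted above. Compute time and any other charges are not modelled, and estimateOperationCostUsd is a hypothetical helper name.

```typescript
// Price quoted in this post: $8 per million durable operations.
const USD_PER_MILLION_OPERATIONS = 8;

// Estimate the durable-operation charge for a batched job.
// ASSUMPTION: one step per batch, plus one wait every `batchesPerWait` batches.
export function estimateOperationCostUsd(
  batchCount: number,
  batchesPerWait: number = 1,
): number {
  if (batchCount < 0 || batchesPerWait < 1) {
    throw new Error("batchCount must be >= 0 and batchesPerWait must be >= 1");
  }
  const steps = batchCount;
  const waits = Math.floor(batchCount / batchesPerWait);
  return ((steps + waits) * USD_PER_MILLION_OPERATIONS) / 1_000_000;
}
```

For example, 100,000 batches with a wait after every batch is 200,000 operations, or about $1.60. Waiting only once every 50 batches brings that down to 102,000 operations, or about $0.82.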

You don't need a context.wait after every batch. Instead, let the current invocation run until there's only 1 min left. You can find out how much time is left in the current invocation with context.lambdaContext.getRemainingTimeInMillis().
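A time-aware version of the loop might look like the sketch below. The TimeAwareContext interface is an assumption modelled on the context.step, context.wait and context.lambdaContext.getRemainingTimeInMillis() calls named in this post, and runBatches is a hypothetical name. The sketch also assumes that branching on the remaining time is acceptable to the runtime on replay, which is worth verifying against the SDK's determinism rules.

```typescript
// ASSUMPTION: a minimal stand-in for the durable context, modelled on the
// calls named in this post. Check the SDK for the real types.
interface TimeAwareContext {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  wait(duration: { seconds: number }): Promise<void>;
  lambdaContext: { getRemainingTimeInMillis(): number };
}

// Hand over to a fresh invocation once less than 1 minute is left.
const SAFETY_MARGIN_MS = 60_000;

// Process batches in order, waiting only when the current invocation is
// close to its time limit. Returns how many waits were issued.
export async function runBatches<T>(
  context: TimeAwareContext,
  batches: T[][],
  processBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let waits = 0;
  let index = 0;
  for (const batch of batches) {
    // A unique step name per batch lets replay match the stored result.
    await context.step(`process-batch-${index}`, () => processBatch(batch));
    index++;
    const hasMoreWork = index < batches.length;
    const remainingMs = context.lambdaContext.getRemainingTimeInMillis();
    if (hasMoreWork && remainingMs < SAFETY_MARGIN_MS) {
      // Running low on time: end this invocation and resume in a new one.
      await context.wait({ seconds: 1 });
      waits++;
    }
  }
  return waits;
}
```

Because the check runs after a batch completes, the safety margin has to be longer than the slowest batch, otherwise the next batch could still run past the 15 mins limit.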

I hope you're experimenting with Lambda Durable Functions, and I'd love to hear how you're using it. It's one of the best additions to Lambda in years!
