First impressions of the Serverless Container Framework


ICYMI, Serverless Inc. recently announced the Serverless Container Framework. It allows you to switch the compute platform between Lambda and Fargate with a one-line config change.

This is a game-changer for many organizations!

It'd hopefully nullify many of the "lock-in" worries about Lambda, too. As your system grows, if Lambda gets expensive, you can easily switch to Fargate without changing your application code.

To be clear, this is something you can already do yourself. It's not a "new" capability.

For example, you can achieve the same level of portability by using the Lambda Web Adapter with a Lambdalith (i.e. a monolithic Lambda function running a web framework such as express.js).
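To make the idea concrete, here is a minimal sketch of a Lambdalith, written in Python with only the standard library rather than a Node.js framework. It assumes the Lambda Web Adapter is attached to the function and forwards requests to a local HTTP port. The `PORT` variable, the 8080 default and the `/health` route are my assumptions for illustration, so check them against your own setup.

```python
import json
import os
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A tiny WSGI app: the whole API lives in one process (a "Lambdalith")."""
    path = environ.get("PATH_INFO", "/")
    if path == "/health":  # hypothetical health-check route
        payload = {"status": "ok"}
    else:
        payload = {"path": path, "message": "hello"}
    body = json.dumps(payload).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve():
    """Listen on a plain HTTP port; nothing here is Lambda-specific."""
    port = int(os.environ.get("PORT", "8080"))  # assumed port setting
    with make_server("0.0.0.0", port, app) as httpd:
        httpd.serve_forever()
```

Calling `serve()` as the container's entry point is the only wiring the application itself needs. On Fargate, the load balancer talks to the port directly. On Lambda, the adapter turns invocation events into HTTP requests to the same port, which is why the application code does not have to change.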

However, just because a solution exists doesn't mean it's widely known or easy to adopt. There is still a lot to figure out, for example:

  • How do you make the Lambda Web Adapter work for your application?
  • How do you structure your application and make switching between Lambda and containers easy?
  • How do you ACTUALLY switch between the two? Do you do that in the IaC tool or a deployment script?
  • How do you make sure your CI/CD pipeline can support both?
  • Which container service to use?
  • Should you use API Gateway or ALB?

I have seen many clients and students struggle to make it work. Those who succeeded also had to invest a lot of development time in figuring out these questions.

The Serverless Container Framework's value is that it eliminates the trial and error and allows you to focus on your application code.

I have tried it out myself, and it works as advertised. But I'd like to draw your attention to a few design choices and (possible) bugs.

API only (for now).

At the time of writing, the framework only supports API workloads. This is the most obvious place to start because it's where portability is needed the most and is the easiest to implement.

Thanks to batching, Lambda can be cost-efficient at processing background and asynchronous tasks at scale. So portability is less of a concern for those workloads.

Also, many services (e.g., Kinesis Data Streams) can deliver events directly to Lambda but not to Fargate.

ALB instead of API Gateway.

Because API Gateway can't connect to Fargate directly, the framework uses ALB instead. This way, when you switch from Lambda to Fargate, the URL of your API stays the same.

Again, this is a sensible design choice.

However, it's worth noting that ALB's uptime-based pricing is more costly in low-throughput environments. Conversely, at higher throughput, ALB is significantly more cost-efficient than API Gateway.
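A rough back-of-the-envelope sketch shows where the crossover sits. The rates below are illustrative assumptions (an hourly ALB charge plus a per-LCU-hour charge, versus a per-million-requests charge for API Gateway), not quotes from the AWS pricing pages. The model also ignores data transfer, free tiers and the fact that LCU usage grows with traffic.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def alb_monthly_cost(avg_lcus, hourly_rate=0.0225, lcu_hourly_rate=0.008):
    """ALB is billed for uptime plus capacity units, even with zero traffic.

    Both rates are assumed example values in USD.
    """
    return HOURS_PER_MONTH * (hourly_rate + avg_lcus * lcu_hourly_rate)


def api_gateway_monthly_cost(requests_per_month, price_per_million=3.50):
    """API Gateway is billed per request (assumed example rate in USD)."""
    return requests_per_month / 1_000_000 * price_per_million


def break_even_requests(avg_lcus):
    """Monthly request count at which the two bills are equal in this model."""
    return alb_monthly_cost(avg_lcus) / 3.50 * 1_000_000


def cheaper_option(requests_per_month, avg_lcus):
    """Return which front door is cheaper under the simplified model."""
    alb = alb_monthly_cost(avg_lcus)
    apigw = api_gateway_monthly_cost(requests_per_month)
    return "ALB" if alb < apigw else "API Gateway"
```

With these assumed rates, `cheaper_option(1_000_000, avg_lcus=1)` favours API Gateway, while `cheaper_option(100_000_000, avg_lcus=2)` favours ALB. Plug in the current prices for your region before drawing conclusions.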

But as I explained in this LinkedIn post, you need to consider the specific cost dimensions of a service. If your API processes large blobs of data, then you're likely to:

a) run into ALB's 1MB payload limit for Lambda targets

b) face much higher costs

So while the choice makes sense for the Serverless Container Framework, you need to make sure it makes sense for your application too.
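If you are unsure whether your payloads are at risk, a small guard like the one below can flag them early. It is only a sketch: the constant encodes the 1MB limit mentioned above as 1024 * 1024 bytes, which is my assumption about how the limit is counted, and the function name is hypothetical.

```python
ALB_LAMBDA_BODY_LIMIT_BYTES = 1024 * 1024  # assumed byte count for the 1MB limit


def exceeds_alb_lambda_limit(body: bytes, limit: int = ALB_LAMBDA_BODY_LIMIT_BYTES) -> bool:
    """Return True if the body is too large for an ALB -> Lambda target.

    Only the raw body size is checked; any encoding overhead added in
    transit is not modelled here.
    """
    return len(body) > limit
```

Running this against a sample of real request and response bodies (or logging when it returns True) tells you how often you would hit the limit on Lambda before you commit to ALB.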

No CloudFormation.

Unlike the Serverless Framework, the Serverless Container Framework no longer uses CloudFormation for deployment.

It's not clear to me why they made this choice. Perhaps because CloudFormation cannot manage some of the resources (e.g. the container images in ECR)?

Whatever the reason, this makes it harder to understand all the resources that the framework has created. Which might be a problem because 👇

"remove" doesn't delete everything.

Running "serverless remove" doesn't remove everything. It only removes the application - which seems to include the Lambda function and the Fargate task, not the VPC, ALB and Fargate cluster.

Running "serverless remove --force" is supposed to remove all resources, but it doesn't!

At the end of my test run, these resources were still left behind after I ran "serverless remove --force":

  • Fargate cluster
  • ALB
  • ECR repository
  • ECR container images
  • VPC and associated configurations
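Since there is no CloudFormation stack to inspect, one way to hunt for leftovers is to query each service yourself. The sketch below assumes boto3-style clients and that the leftover resources share some recognisable name fragment. I don't know the framework's actual naming convention, so `name_fragment` is a hypothetical parameter you would need to work out from your own account. Pagination is ignored for brevity.

```python
def find_leftovers(ecs, elbv2, ecr, ec2, name_fragment):
    """List resources whose name, ARN or Name tag contains name_fragment.

    The clients are expected to behave like boto3 clients for ECS, ELBv2,
    ECR and EC2. Only the first page of each response is inspected.
    """
    def matches(text):
        return name_fragment.lower() in (text or "").lower()

    clusters = [arn for arn in ecs.list_clusters().get("clusterArns", [])
                if matches(arn)]
    load_balancers = [lb["LoadBalancerName"]
                      for lb in elbv2.describe_load_balancers().get("LoadBalancers", [])
                      if matches(lb["LoadBalancerName"])]
    repositories = [repo["repositoryName"]
                    for repo in ecr.describe_repositories().get("repositories", [])
                    if matches(repo["repositoryName"])]
    vpcs = []
    for vpc in ec2.describe_vpcs().get("Vpcs", []):
        tags = {tag["Key"]: tag["Value"] for tag in vpc.get("Tags", [])}
        if matches(tags.get("Name")):  # assumes the VPC carries a Name tag
            vpcs.append(vpc["VpcId"])

    return {
        "ecs_clusters": clusters,
        "load_balancers": load_balancers,
        "ecr_repositories": repositories,
        "vpcs": vpcs,
    }


def find_leftovers_in_account(name_fragment):
    """Convenience wrapper that builds real clients (requires boto3)."""
    import boto3  # third-party; imported lazily so the helper above stays testable
    return find_leftovers(
        boto3.client("ecs"),
        boto3.client("elbv2"),
        boto3.client("ecr"),
        boto3.client("ec2"),
        name_fragment,
    )
```

Anything this reports still needs to be checked by hand before you delete it, since a name match alone doesn't prove the framework created the resource.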

Always two container images.

By default, the framework always builds two container images - one for the Lambda function and one for Fargate.

This makes deployments pretty slow when you have made code changes. However, when you're just switching the compute platform, it's very fast!

This feels like an odd choice to me.

Performance-wise, it's sacrificing 99.999% of deployments for the one time when you need to switch the compute platform without a code change.

Maybe it's the one use case that the Serverless Inc. developers saw every day as they worked on the framework. But it's not the norm for anyone who'd be using the framework.

Is it worth the license fee?

Finally, the new Serverless Container Framework requires the Serverless CLI v4. It's therefore covered under v4's pricing model, which applies to organizations making $2M+ in annual revenue.

I have seen organizations waste tens of thousands of dollars' worth of engineering time trying to do this poorly. So, personally, I think the Serverless Container Framework is well worth its fee.

However, your perceived value of the framework depends on your ability (and time!) to figure things out yourself and come up with a suitable solution for your organization. That is, assuming you need the ability to switch between Lambda and Fargate to begin with!

