Read on lumigo blog
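Step Functions has added support for testing individual states [1] with the new TestState API [2]. It lets you execute an individual state by passing in the state definition, an IAM role and an optional input.
It returns the state's output (or the error and cause if the state failed), the status of the test, and the next state to transition to.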
With the TestState API, you can thoroughly test every state and achieve close to 100% coverage of a state machine.
So does this eliminate the need for Step Functions Local [3]?
Can we do away with end-to-end tests as well?
If not, then where should this new API fit into your workflow and how should you use it?
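As I wrote previously [4], my strategy for testing Step Functions uses a combination of component-level tests for the Lambda functions, end-to-end tests that execute the state machine in the cloud, and Step Functions Local with mocks for execution paths that are difficult to reach.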
The TestState API lets you test these hard-to-reach states directly. It should help you achieve better test coverage of your state machine with less effort.
However, it’s worth remembering that it’s not a local simulation tool. In most cases, it won’t speed up your feedback loop.
For example, if you’re testing a Lambda-based Task state, then the referenced Lambda function and the relevant IAM role need to be deployed first. Similarly, after you change the Lambda function, you have to deploy the change before you can test the state.
Another good use case for the TestState API is testing input and output processing logic [5]. This includes when you modify the current input with the Pass state’s Result field.
Because the TestState API takes in the state definition as an argument, you do not have to redeploy the state machine after every change. Instead, you can iterate on your settings and test them by passing the modified state definition to the TestState API.
For example, take the Task 2 state from the imaginary state machine above:
We can write tests to make sure that the state returns the expected nextState, such as the Task 3 state.
We need a way to fetch the definition of our state machine and the IAM role we should use. I like to encapsulate this into a given module, like this:
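A minimal sketch of what such a given module could look like in Python is below. It is not the code from the demo project: the function names `a_state_machine` and `a_state` are hypothetical, and `sfn` is assumed to be a boto3 Step Functions client. It relies on `describe_state_machine`, which returns the definition as a JSON string along with the role ARN.

```python
import json


def a_state_machine(sfn, state_machine_arn):
    """Fetch the deployed definition and execution role of a state machine.

    `sfn` is assumed to be a boto3 Step Functions client.
    """
    response = sfn.describe_state_machine(stateMachineArn=state_machine_arn)
    return {
        # `definition` comes back as a JSON string of the whole state machine.
        "definition": json.loads(response["definition"]),
        "role_arn": response["roleArn"],
    }


def a_state(state_machine, state_name):
    """Pick a single top-level state out of the parsed definition."""
    return state_machine["definition"]["States"][state_name]
```

Note that `a_state` only looks at top-level states; states nested inside a Parallel or Map state would need extra handling.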
And we also need a way to call the TestState API with our state definition and input. I like to encapsulate this into a when module:
So I can keep my test code simple and easy to read.
(You can try out this demo project here [6])
I can write tests like this for every state in the state machine and cover every scenario.
However, as I mentioned before, both the Lambda function (used by the Task state) and the IAM role need to be deployed first. So your typical workflow would be to deploy the project first, then run these tests against the deployed resources.
As you iterate on your state definitions and Lambda functions, how do you maintain a fast feedback loop? Can you avoid having to redeploy the project every time you make a change?
Yes, you can. That’s why we need a full suite of different tests.
You should still perform component-level testing on the Lambda functions involved.
Use “remocal testing” (i.e. execute the Lambda function code locally against remote AWS resources) to maintain a fast feedback loop as you iterate on your Lambda function.
As you iterate on your Lambda function, you can run these tests and execute the latest code locally. Because the code is executed locally, you don’t need to deploy it to the Lambda service.
But a Task state is more than just the Lambda function. There is input and output processing, and there are error-handling settings as well.
The TestState API helps you test these settings as we have seen in the example above.
Step Functions Local was best used to test execution paths that are difficult to reach, thanks to its mocking capability.
The ability to test individual states means this is no longer necessary.
Another potential use for Step Functions Local is so that you can iterate on your state machine locally without redeploying the project.
Unfortunately, this doesn’t work very well in practice.
That’s because your state machine likely depends on Lambda functions, SNS topics and other AWS resources. So you either have to provide a full simulation of all these resources (e.g. by running LocalStack [7]) or you still have to deploy your project first.
The same dynamic still exists with the TestState API.
But no, you don’t need to use Step Functions Local anymore.
End-to-end tests execute the state machine in the cloud and make sure everything works together. Before the TestState API, end-to-end tests played an important role in my test strategy.
They were the workhorse in my test suites.
From a test coverage point of view, you don’t need end-to-end tests anymore. You can achieve better test coverage with less effort by testing individual states with the TestState API.
However, it’s easy to lose sight of the forest when you only look at the individual trees.
I think there is still value in having end-to-end tests for business-critical execution paths. This is to ensure that all the individual states do indeed function together as a unit.
In a state machine, data flows from one state to the next. You need to make sure that if you change the output from Task #1 (see below) then you also change the conditions in Choice #2.
It’s easy to break the contract between Task #1 and Choice #2 when you’re testing them separately.
This is similar to the kind of integration problems that you often face in a microservices environment. In the context of a state machine, end-to-end tests can help you catch these “integration” problems early.
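As a sketch of what such an end-to-end test could be built on, the Python helper below starts an execution and polls until it finishes, so a test can assert on the final status and output. The helper name `run_to_completion` and its timeout parameters are assumptions; `sfn` is assumed to be a boto3 Step Functions client, and the state machine is assumed to be a Standard workflow (so that `describe_execution` is available).

```python
import json
import time


def run_to_completion(sfn, state_machine_arn, execution_input,
                      timeout_s=60, poll_s=1.0):
    """Start an execution and poll until it leaves the RUNNING state.

    `sfn` is assumed to be a boto3 Step Functions client.
    """
    started = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(execution_input),
    )
    deadline = time.monotonic() + timeout_s
    while True:
        execution = sfn.describe_execution(executionArn=started["executionArn"])
        if execution["status"] != "RUNNING":
            # e.g. SUCCEEDED, FAILED, TIMED_OUT or ABORTED
            return execution
        if time.monotonic() >= deadline:
            raise TimeoutError("execution did not finish in time")
        time.sleep(poll_s)
```

A test for a business-critical path would call this with a realistic input and assert that the status is `SUCCEEDED` and that the output matches what downstream states expect.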
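To summarise:
Use the TestState API to test individual states, especially the ones that are hard to reach in an end-to-end test.
You no longer need Step Functions Local.
Keep component-level tests for your Lambda functions, and keep end-to-end tests for business-critical execution paths.
The TestState API is not a local simulation tool. To test a Lambda-based Task state, you still have to deploy the project first.
[1] Official announcement blog post for the new TestState API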
[2] The TestState API reference
[3] Step Functions Local
[4] A practical guide to testing AWS Step Functions
[5] Input and Output processing in Step Functions
[6] Demo project to illustrate how to use the TestState API
[7] LocalStack