Bye bye schema coupling, hello semantic coupling


I recently shared six event versioning strategies for event-driven architectures [1]. In response, Marty Pitt reached out and showed me how Orbital [2] and Taxi [3] use semantic tags to eliminate schema coupling in event-driven architectures and simplify schema management.

It's a novel way to manage schema evolution, and I want to share what I learnt with you.

Problems with Schema Coupling

In an event-driven architecture, event consumers are typically coupled to the schema of the event payload, as it serves as the contract between the event publisher, the event bus, and the event consumers.

This is a form of schema coupling.

When you change the schema of the event (including the data type of its fields), the consumer must change accordingly.

Therefore, you need to carefully manage the evolution of the event schema to avoid breaking existing consumers. Hence the need for versioning, or for preventing breaking changes altogether [4].
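To make the coupling concrete, here is a minimal Python sketch of a consumer that reads a field by name. The payloads, the "orderId" field and the handler are hypothetical and only illustrate how a schema change breaks a consumer that was written against the old shape.

```python
# Hypothetical v1 and v2 payloads for the same business event.
order_placed_v1 = {"orderId": "o-1", "customerId": "c-42"}
order_placed_v2 = {"orderId": "o-1", "customer": {"id": "c-42"}}


def handle_order_placed(event: dict) -> str:
    # The consumer reaches into the payload by field name,
    # so it is coupled to the v1 schema.
    return f"notify customer {event['customerId']}"


print(handle_order_placed(order_placed_v1))  # works against v1

try:
    handle_order_placed(order_placed_v2)
except KeyError as missing_field:
    # v2 moved the ID under "customer", so the v1-shaped consumer breaks.
    print(f"consumer broke on v2: missing field {missing_field}")
```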

Semantic Coupling

What if consumers are coupled to the meaning of the data (i.e. the semantics) rather than its representation?

In the example below, the two event versions have different schemas. However, both "customerId" and "customer.id" refer to the same concept - a customer ID.
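The idea can be sketched in a few lines of Python. This is not Taxi syntax or Orbital's API. The tag name "acme.CustomerId", the mapping format and the read_tag helper are all assumptions made for illustration.

```python
from typing import Any

# Assumed tag name; real Taxi types are declared differently.
CUSTOMER_ID = "acme.CustomerId"

# One mapping per schema version: semantic tag -> path into the payload.
MAPPINGS: dict[str, dict[str, tuple[str, ...]]] = {
    "v1": {CUSTOMER_ID: ("customerId",)},
    "v2": {CUSTOMER_ID: ("customer", "id")},
}


def read_tag(event: dict[str, Any], version: str, tag: str) -> Any:
    """Resolve a semantic tag by walking the path mapped for this schema version."""
    value: Any = event
    for key in MAPPINGS[version][tag]:
        value = value[key]
    return value


event_v1 = {"customerId": "c-42"}
event_v2 = {"customer": {"id": "c-42"}}

# The consumer asks for the tag and never sees the field names.
print(read_tag(event_v1, "v1", CUSTOMER_ID))  # c-42
print(read_tag(event_v2, "v2", CUSTOMER_ID))  # c-42
```

When the schema changes, only the entry in MAPPINGS changes. The code that asks for the tag stays the same.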

In Orbital, consumers subscribe to and query these semantic tags, rather than the event payload. When Orbital (the event gateway) delivers data to the consumers, it delivers them as semantic tags, not the raw events.

As you can see in the example above, as you evolve the event schema, you update the mapping accordingly. Consumers are unaware of the schema change because it is hidden from them.

But what if you remove a field or change its data type?

That's where "semantic functions" come in. They are a way to transform or enrich the raw event data.

If the event no longer carries the customer's name, then a semantic function can call out to an HTTP API and fetch the data from it.

However, it would be a waste to do this for every event if no consumers need the customer's name. So, semantic functions are only run on fields when a consumer has requested the field. This works similarly to GraphQL resolvers, which are lazily evaluated based on the user query.
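Here is a rough Python sketch of that lazy behaviour. It does not reflect how Orbital implements it. The tag names, the RESOLVERS table, the deliver function and the fake customer lookup (standing in for an HTTP call) are hypothetical.

```python
from typing import Any, Callable

calls: list[str] = []  # records lookups so the laziness is observable


def fetch_customer_name(event: dict[str, Any]) -> str:
    # Stand-in for an HTTP call to a customer API (hypothetical).
    calls.append(event["customerId"])
    return {"c-42": "Jane Doe"}[event["customerId"]]


# Semantic tag -> function that derives the value from the raw event.
RESOLVERS: dict[str, Callable[[dict[str, Any]], Any]] = {
    "CustomerId": lambda event: event["customerId"],
    "CustomerName": fetch_customer_name,
}


def deliver(event: dict[str, Any], requested_tags: list[str]) -> dict[str, Any]:
    """Run only the resolvers for the tags the consumer asked for."""
    return {tag: RESOLVERS[tag](event) for tag in requested_tags}


event = {"customerId": "c-42"}  # the name is no longer in the payload

print(deliver(event, ["CustomerId"]))  # no lookup is made
print(deliver(event, ["CustomerId", "CustomerName"]))  # one lookup is made
```

A consumer that only asks for the customer ID never triggers the lookup, so the enrichment cost is paid only by the consumers that need the name.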

Similarly, a semantic function can also be used to perform data transformation. For example, to split "full-name" into "first-name" and "last-name", or to convert "created-at" from DateTime to a string.
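Both transformations are small pure functions. The sketch below assumes a naive split on the first space and ISO 8601 as the string format, neither of which comes from Orbital or Taxi.

```python
from datetime import datetime, timezone


def split_full_name(full_name: str) -> dict[str, str]:
    # Naive split on the first space; real names need more careful handling.
    first, _, last = full_name.partition(" ")
    return {"first-name": first, "last-name": last}


def created_at_as_string(created_at: datetime) -> str:
    # ISO 8601 is an assumed target format.
    return created_at.isoformat()


print(split_full_name("Jane Doe"))
print(created_at_as_string(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)))
```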

Finally, Orbital sits between data sources and consumers, and can work in both request-response and event-driven contexts.

The closest comparison in AWS is perhaps a combination of EventBridge and EventBridge Pipes.

Conclusion

Putting Orbital aside, I really like this approach of using semantic tags to decouple event consumers from the data representation.

Subscribing by meaning instead of by schema turns every change into a local mapping update. It eliminates the need for event versioning and prevents breaking changes at the same time.

I love the simplicity of this approach, and I hope other tools take note so that it becomes more widely adopted!

Links

[1] Event versioning strategies for event-driven architectures

[2] Orbital, a data integration platform

[3] Taxi, a language for describing how your data and services should connect together

[4] How to detect and prevent breaking changes in event schemas
