Bye bye schema coupling, hello semantic coupling


I recently shared six event versioning strategies for event-driven architectures [1]. In response, Marty Pitt reached out and showed me how Orbital [2] and Taxi [3] use semantic tags to eliminate schema coupling in event-driven architectures and simplify schema management.

It's a novel way to manage schema evolution, and I want to share what I learnt with you.

Problems with Schema Coupling

In an event-driven architecture, event consumers are typically coupled to the schema of the event payload, as it serves as the contract between the event publisher, the event bus, and the event consumers.

This is a form of schema coupling.

When you change the schema of the event (including the data type of its fields), the consumer must change accordingly.

Therefore, you need to carefully manage the evolution of the event schema to avoid breaking existing consumers. Hence the need for versioning, or for preventing breaking changes altogether [4].
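To make the problem concrete, here is a minimal Python sketch of a consumer that is coupled to the event schema. The payloads, the field names and the handle_order function are illustrative assumptions, not taken from any real system.

```python
# Hypothetical payloads for two versions of the same event.
event_v1 = {"customerId": "c-123", "total": 42}
event_v2 = {"customer": {"id": "c-123"}, "total": 42}


def handle_order(event: dict) -> str:
    # The consumer reads a specific field path, so it depends on the v1 layout.
    return event["customerId"]


print(handle_order(event_v1))  # works against v1

try:
    handle_order(event_v2)
except KeyError as err:
    # The same data is still there, but the consumer can no longer find it.
    print(f"v2 breaks the consumer: missing key {err}")
```

Nothing about the customer ID itself changed between the two versions. Only its location in the payload did, and that is enough to break the consumer.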

Semantic Coupling

What if consumers are coupled to the meaning of the data (i.e. the semantics) rather than its representation?

In the example below, the two event versions have different schemas. However, both "customerId" and "customer.id" refer to the same concept - a customer ID.

In Orbital, each field is mapped to a semantic tag that names the concept it represents. Consumers subscribe to and query these semantic tags, rather than the event payload. When Orbital (the event gateway) delivers data to the consumers, it delivers it as semantic tags, not as raw events.

As you can see in the example above, as you evolve the event schema, you update the mapping accordingly. Consumers are unaware of the schema change because it is hidden from them.
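The idea can be sketched in plain Python as follows. This is not Orbital's or Taxi's actual syntax: the tag name, the MAPPINGS table and the resolve function are all hypothetical, and only show how a per-version mapping can hide the field path from consumers.

```python
from typing import Any

# Hypothetical semantic tag; the name is made up for illustration.
CUSTOMER_ID = "acme.CustomerId"

# One mapping per schema version: semantic tag -> path into the raw payload.
MAPPINGS: dict[str, dict[str, tuple[str, ...]]] = {
    "v1": {CUSTOMER_ID: ("customerId",)},
    "v2": {CUSTOMER_ID: ("customer", "id")},
}


def resolve(event: dict[str, Any], version: str, tag: str) -> Any:
    """Follow the mapped path for `tag` through the raw event payload."""
    value: Any = event
    for key in MAPPINGS[version][tag]:
        value = value[key]
    return value


event_v1 = {"customerId": "c-123"}
event_v2 = {"customer": {"id": "c-123"}}

# The consumer asks for the tag and never sees the field path.
assert resolve(event_v1, "v1", CUSTOMER_ID) == resolve(event_v2, "v2", CUSTOMER_ID)
```

When the schema changes again, only a new entry in the mapping is needed. The consumer code that asks for the tag stays the same.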

But what if you remove a field or change its data type?

That's where "semantic functions" come in. They are a way to transform or enrich the raw event data.

If the event no longer carries the customer's name, then a semantic function can call out to an HTTP API and fetch the data from it.

However, it would be a waste to do this for every event if no consumers need the customer's name. So, semantic functions are only run on fields when a consumer has requested the field. This works similarly to GraphQL resolvers, which are lazily evaluated based on the user query.
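Here is a rough Python sketch of that lazy behaviour. It is not how Orbital implements it: the tag names, the deliver function and the in-memory lookup that stands in for the HTTP call are all assumptions made for illustration.

```python
from typing import Any, Callable

calls: list[str] = []  # records lookups so the laziness is observable


def fetch_customer_name(event: dict[str, Any]) -> str:
    # Stand-in for an HTTP call to a customer service (hypothetical).
    calls.append(event["customerId"])
    return {"c-123": "Ada Lovelace"}[event["customerId"]]


# Semantic tag -> function that derives the value from the raw event.
SEMANTIC_FUNCTIONS: dict[str, Callable[[dict[str, Any]], Any]] = {
    "CustomerId": lambda event: event["customerId"],
    "CustomerName": fetch_customer_name,
}


def deliver(event: dict[str, Any], requested: list[str]) -> dict[str, Any]:
    """Build the consumer's view, running only the functions it asked for."""
    return {tag: SEMANTIC_FUNCTIONS[tag](event) for tag in requested}


event = {"customerId": "c-123"}  # the name is no longer in the payload

ids_only = deliver(event, ["CustomerId"])
assert calls == []  # no lookup: nobody asked for the name

with_name = deliver(event, ["CustomerId", "CustomerName"])
assert calls == ["c-123"]  # the lookup ran once, on demand
```

The expensive lookup runs only for the consumer that requested the customer's name, which is the same trade-off a GraphQL resolver makes.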

Similarly, a semantic function can also be used to perform data transformation. For example, to split "full-name" into "first-name" and "last-name", or to convert "created-at" from DateTime to a string.
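Both transformations are easy to picture as small pure functions. The sketch below is plain Python with hypothetical function names; the naive split on the first space and the ISO 8601 output format are assumptions, not something Orbital prescribes.

```python
from datetime import datetime, timezone


def split_full_name(full_name: str) -> dict[str, str]:
    # Naive split on the first space; real names need more care.
    first, _, last = full_name.partition(" ")
    return {"first-name": first, "last-name": last}


def created_at_to_string(created_at: datetime) -> str:
    # ISO 8601 is an assumed target format.
    return created_at.isoformat()


print(split_full_name("Ada Lovelace"))
print(created_at_to_string(datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)))
```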

Finally, Orbital sits between data sources and consumers, and can work in both request-response and event-driven contexts.

The closest comparison in AWS is perhaps a combination of EventBridge and EventBridge Pipes.

Conclusion

Putting Orbital aside, I really like this approach of using semantic tags to decouple event consumers from the data representation.

Subscribing by meaning instead of by schema turns every change into a local mapping update. It eliminates the need for event versioning and prevents breaking changes at the same time.

I love the simplicity of this approach, and I hope other tools take note so that it becomes more widely adopted!

Links

[1] Event versioning strategies for event-driven architectures

[2] Orbital, a data integration platform

[3] Taxi, a language for describing how your data and services should connect together

[4] How to detect and prevent breaking changes in event schemas

