I recently shared six event versioning strategies for event-driven architectures [1]. In response, Marty Pitt reached out and showed me how Orbital [2] and Taxi [3] use semantic tags to eliminate schema coupling in event-driven architectures and simplify schema management. It's a novel way to manage schema evolution, and I want to share what I learnt with you.

Problems with Schema Coupling

In an event-driven architecture, event consumers are typically coupled to the schema of the event payload, as it serves as the contract between the event publisher, the event bus, and the event consumers. This is a form of schema coupling. When you change the schema of the event (including the data type of its fields), the consumers must change accordingly. Therefore, you need to manage the evolution of the event schema carefully to avoid breaking existing consumers. Hence the need for versioning, or for avoiding breaking changes altogether [4].

Semantic Coupling

What if consumers are coupled to the meaning of the data (i.e. the semantics) rather than its representation?

Take two versions of an event with different schemas, where one carries "customerId" and the other "customer.id". Both fields refer to the same concept: a customer ID. Each field is mapped to a semantic tag that names that concept.

In Orbital, consumers subscribe to and query these semantic tags, rather than the event payload. When Orbital (the event gateway) delivers data to the consumers, it delivers the data as semantic tags, not as raw events.

As you evolve the event schema, you update the mapping accordingly. Consumers are unaware of the schema change because it is hidden from them.

But what if you remove a field or change its data type? That's where "semantic functions" come in. They are a way to transform or enrich the raw event data. If the event no longer carries the customer's name, then a semantic function can call out to an HTTP API and fetch it.
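To make the idea concrete, here is a minimal Python sketch of mapping two schema versions onto the same semantic tags. It is not Orbital's API or Taxi syntax. The tag names, the `MAPPINGS` table, the `resolve` helper, the `customerName` field and the `lookup_customer_name` stand-in for an HTTP call are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List

Event = Dict[str, Any]
Extractor = Callable[[Event], Any]

# Hypothetical semantic tag names (not taken from Orbital or Taxi).
CUSTOMER_ID = "CustomerId"
CUSTOMER_NAME = "CustomerName"

# Records every lookup so we can see when the "HTTP call" actually happens.
lookup_calls: List[str] = []


def lookup_customer_name(customer_id: str) -> str:
    """Stand-in for an HTTP call to a customer API (hypothetical)."""
    lookup_calls.append(customer_id)
    return {"c-42": "Ada Lovelace"}[customer_id]


# One mapping per schema version: semantic tag -> how to read it from the payload.
MAPPINGS: Dict[int, Dict[str, Extractor]] = {
    1: {
        CUSTOMER_ID: lambda e: e["customerId"],
        CUSTOMER_NAME: lambda e: e["customerName"],  # assumed v1 field
    },
    2: {
        CUSTOMER_ID: lambda e: e["customer"]["id"],
        # Assume v2 dropped the name: a semantic function fetches it instead.
        CUSTOMER_NAME: lambda e: lookup_customer_name(e["customer"]["id"]),
    },
}


def resolve(event: Event, version: int, requested: Iterable[str]) -> Dict[str, Any]:
    """Return only the semantic tags the consumer asked for."""
    mapping = MAPPINGS[version]
    return {tag: mapping[tag](event) for tag in requested}


v1_event = {"customerId": "c-42", "customerName": "Ada Lovelace"}
v2_event = {"customer": {"id": "c-42"}}

# The consumer asks for a tag, not a field path, so both versions look the same.
print(resolve(v1_event, 1, [CUSTOMER_ID]))  # {'CustomerId': 'c-42'}
print(resolve(v2_event, 2, [CUSTOMER_ID]))  # {'CustomerId': 'c-42'}
```

In this sketch, a schema change only touches the entry for the new version in `MAPPINGS`. Resolving `CUSTOMER_NAME` for a v2 event goes through `lookup_customer_name`, which plays the role of the call to the HTTP API.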
However, it would be a waste to do this for every event if no consumers need the customer's name. So, semantic functions run on a field only when a consumer has requested that field. This works similarly to GraphQL resolvers, which are lazily evaluated based on the user's query.

A semantic function can also perform data transformation. For example, it can split "full-name" into "first-name" and "last-name", or convert "created-at" from a DateTime to a string.

Finally, Orbital sits between data sources and consumers, and can work in both request-response and event-driven contexts. The closest comparison in AWS is perhaps a combination of EventBridge and EventBridge Pipes.

Conclusion

Putting Orbital aside, I really like this approach of using semantic tags to decouple event consumers from the data representation. Subscribing by meaning instead of by schema turns every change into a local mapping update. It eliminates the need for event versioning and prevents breaking changes at the same time.

I love the simplicity of this approach, and I hope other tools take note so that it becomes more widely adopted!

Links

[1] Event versioning strategies for event-driven architectures
[2] Orbital, a data integration platform
[3] Taxi, a language for describing how your data and services should connect together
[4] How to detect and prevent breaking changes in event schemas