How to Upgrade CQRS Events Without Busting Your Event Stream

Events are at the heart of a CQRS Event Sourced system, which is why changing or upgrading them can be problematic. In this post I’m going to cover a few principles to bear in mind, which should help you avoid hitting the rocks. Before I dive into ‘how to upgrade CQRS events’, I’m going to recap the role events play in the system.

The Role of Events in CQRS ES

An event is simply a message in the form of a class describing a state change within your system. The class is usually immutable and made up of a series of public readonly fields. It contains a description of a state change and is stored in an event stream.
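As a rough illustration, an event of this kind might look like the following. The description above suggests C# (public readonly fields); this sketch uses Python instead, with a frozen dataclass as the closest equivalent. The CustomerRelocated event and its fields are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class CustomerRelocated:
    """Hypothetical event: a customer moved to a new city."""
    customer_id: str
    new_city: str


event = CustomerRelocated(customer_id="c-1", new_city="London")

# Frozen dataclasses reject assignment, so an event cannot be altered once created.
try:
    event.new_city = "Paris"
    mutated = True
except FrozenInstanceError:
    mutated = False
```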

Every event is required in order to rebuild the current state of your aggregate root. Events are also used to maintain a read model via projections. Put simply, events are the heart of a CQRS/ES system and should be handled with care! Given that context, there are some simple rules to bear in mind when you need to change the definition of an event.
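To make the two uses of the stream concrete, here is a minimal Python sketch of replaying events into an aggregate and into a projection. The Account aggregate, the two event types and the project_balances function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountOpened:
    account_id: str


@dataclass(frozen=True)
class MoneyDeposited:
    account_id: str
    amount: int


class Account:
    """Aggregate root whose state is rebuilt purely from its events."""

    def __init__(self):
        self.account_id = None
        self.balance = 0

    def apply(self, event):
        if isinstance(event, AccountOpened):
            self.account_id = event.account_id
        elif isinstance(event, MoneyDeposited):
            self.balance += event.amount

    @classmethod
    def from_stream(cls, events):
        account = cls()
        for event in events:
            account.apply(event)
        return account


def project_balances(events):
    """Projection: a read model mapping account id to balance."""
    balances = {}
    for event in events:
        if isinstance(event, AccountOpened):
            balances[event.account_id] = 0
        elif isinstance(event, MoneyDeposited):
            balances[event.account_id] += event.amount
    return balances


stream = [AccountOpened("a-1"), MoneyDeposited("a-1", 50), MoneyDeposited("a-1", 25)]
account = Account.from_stream(stream)
read_model = project_balances(stream)
```

Both consumers read the same stored events, which is why a renamed or deleted field would break either of them on replay.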

How To Upgrade CQRS Events

  1. Never rename a field
  2. Never delete a field
  3. Never rename an event message

So what can you do? You can add fields or create new messages, e.g. SomeEventV2.
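The two permitted changes might look something like this in Python. It is only a sketch: OrderPlaced, OrderPlacedV2 and the currency field are hypothetical, and it assumes stored events are loaded from simple key/value payloads.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    total: int
    # Added later. Existing fields keep their names; None marks "not recorded".
    currency: Optional[str] = None


@dataclass(frozen=True)
class OrderPlacedV2:
    """Alternative: a new message alongside the old one, which stays untouched."""
    order_id: str
    total: int
    currency: str


# A payload stored before the change still loads into the extended class.
old_payload = {"order_id": "o-1", "total": 100}
old_event = OrderPlaced(**old_payload)

new_event = OrderPlacedV2(order_id="o-2", total=200, currency="GBP")
```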

This is where it gets interesting – how should we handle old events that don’t have data in the new field? Is the new field intended to replace the use of an existing field? If so, what do we do with the existing field? Should our aggregate roots and de-normalisers still handle old events?

Handling Missing Fields

The most obvious solution is to use a sensible default, or to decide whether the missing data is in fact important. Another approach is to derive the value from the existing fields. But what happens when neither of those alternatives is possible? For that we need to examine when and how the information is needed. Events are raised in the domain as a result of a command. The event field values are derived from the current state of the domain object and the command issued, so all new events raised will have the correct information. This allows you to upgrade the de-normalisers to expect the new fields to be populated correctly.
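The first two options, a sensible default and a value derived from existing fields, can be sketched as follows. The event, its fields and the DEFAULT_CURRENCY constant are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_CURRENCY = "GBP"  # assumed business default, purely illustrative


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    net: int
    tax: int
    currency: Optional[str] = None  # new field, option 1: sensible default
    gross: Optional[int] = None     # new field, option 2: derived from net + tax


def currency_of(event):
    """Fall back to the default when an old event has no currency."""
    return event.currency if event.currency is not None else DEFAULT_CURRENCY


def gross_of(event):
    """Derive the value from existing fields when an old event lacks it."""
    return event.gross if event.gross is not None else event.net + event.tax


old = OrderPlaced(order_id="o-1", net=100, tax=20)
new = OrderPlaced(order_id="o-2", net=50, tax=10, currency="EUR", gross=60)
```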

What About Re-Building Read Models

To re-build a read model you need to re-run the event stream. The event stream will contain old style events. As a result you will need to provide a way to upgrade the event, independently of the domain. A pragmatic solution is to use an extension method. On receipt of an old event, upgrade it via an extension method.
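Extension methods are a C# feature. In the Python sketch below a plain function plays the same role: it sits outside both the event class and the domain, and turns an old style event into the new one before the projection sees it. All names here (CustomerRegistered, CustomerRegisteredV2, upgrade, lookup_region) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRegistered:      # old style event, still in the stream
    customer_id: str
    name: str


@dataclass(frozen=True)
class CustomerRegisteredV2:    # new style event with the extra field
    customer_id: str
    name: str
    region: str


def upgrade(event, lookup_region):
    """Return the newest version of an event; new style events pass through."""
    if isinstance(event, CustomerRegistered):
        return CustomerRegisteredV2(
            event.customer_id, event.name, lookup_region(event.customer_id)
        )
    return event


def rebuild_read_model(stream, lookup_region):
    """Re-run the stream; the projection only ever sees upgraded events."""
    customers_by_region = {}
    for event in (upgrade(e, lookup_region) for e in stream):
        if isinstance(event, CustomerRegisteredV2):
            customers_by_region.setdefault(event.region, []).append(event.name)
    return customers_by_region


regions = {"c-1": "UK"}  # stand-in for whatever lookup the upgrade needs
stream = [
    CustomerRegistered("c-1", "Ada"),
    CustomerRegisteredV2("c-2", "Grace", "US"),
]
read_model = rebuild_read_model(stream, lambda cid: regions.get(cid, "unknown"))
```

Keeping the upgrade in one function means the projection code only has to understand the latest shape of each event.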

A note on performance: You may find that you are unable to derive the value without carrying out an expensive lookup or calculation. In most cases this process does not need to be super fast, as it is only going to be used to re-build the read model. This is not part of the normal flow of the application and is unlikely to happen often. However, bear in mind the number of events that will need to be run and the duration of a single upgrade. If you have many events and/or a slow upgrade process, you may need to consider optimisations or overnight runs. Given this shouldn’t happen often, consider carefully whether optimisation is really worthwhile. Don’t pre-emptively optimise the upgrader; it probably doesn’t need it.

Upgrading Domain Models

Given the content of an event is derived from the state of the domain model, upgrading events is not normally an issue within a model. In most cases all the information you need to derive the missing field is available to the model. However, there are times when this information is not available. This can be an indication that the design of the aggregate is wrong, but you may find it is not practical to change that. An alternative approach is to provide an upgrading service to the model. As always, a domain model should be as independent as possible. On occasion, however, it may need to reach out for additional information. Rather than giving the model direct access to a data source, define an interface for a service within the model. This interface acts in a similar fashion to an Anti-Corruption Layer (ACL). The service is then implemented by the client, who can use any data service required to source the information.
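One way this could look, sketched in Python under assumed names: the model owns a small RegionLookup interface, and the client supplies an implementation. The Customer aggregate, the CustomerRenamed event and InMemoryRegionLookup are all hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


class RegionLookup(Protocol):
    """Interface owned by the domain model; it says nothing about data sources."""

    def region_for(self, customer_id: str) -> str: ...


@dataclass(frozen=True)
class CustomerRenamed:
    customer_id: str
    new_name: str
    region: str  # new field the aggregate cannot derive from its own state


class Customer:
    def __init__(self, customer_id, regions: RegionLookup):
        self.customer_id = customer_id
        self._regions = regions
        self.uncommitted = []

    def rename(self, new_name):
        # The model asks the service for the missing information
        # at the moment it raises the event.
        region = self._regions.region_for(self.customer_id)
        self.uncommitted.append(CustomerRenamed(self.customer_id, new_name, region))


class InMemoryRegionLookup:
    """Client-side implementation; a real one might query a database or an API."""

    def __init__(self, data):
        self._data = data

    def region_for(self, customer_id):
        return self._data[customer_id]


customer = Customer("c-1", InMemoryRegionLookup({"c-1": "UK"}))
customer.rename("Ada L.")
```

The model depends only on the interface, so the data source behind it can change without touching the aggregate.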

A note on performance: Given this is part of the normal flow of the application, it is worth monitoring for performance issues. In general, commands should ‘complete’ as fast as possible. The command itself may not be called often and therefore may not be a good candidate for performance tuning.

Over time you may end up with a proliferation of V numbered event names

A key value of running a system with explicit and well named events is the readability of the event stream. You know you have done a good job when, on seeing the event stream, a domain expert would have a good idea of what’s happening. A stream full of V numbered names erodes that readability. To mitigate this, consider using a more descriptive name rather than simply appending V2.

Conclusion

So to wrap things up, we have seen the basic rules for changes to event messages themselves. Don’t change existing fields, because they are needed each time the event stream is re-run to bring an aggregate up to date. Adding new fields is OK, but an upgrade strategy needs to be in place. If you do not need to re-run any of the events in the future, you can skip upgraders for the read side. In most cases upgrading events within an aggregate is trivial, because you have local access to all the state needed to provide the missing information. When this is not the case, provide a service that will deliver the missing information. If practical, consider re-designing the aggregate. Finally, don’t pre-emptively optimise the upgrade process. Rather, consider the volume of events, the time an individual upgrade takes and how often the process is likely to be needed.

Daniel

I'm a professional software engineer of near on 15 years. Lucky enough to work for a small but rapidly growing company in London called Redington. They have given me the technical freedom to learn some cutting edge technologies like CQRS and Event Sourcing. Now I'm sharing what I learn here.