This question keeps cropping up: how do you handle set-based consistency validation in CQRS? In fact, one of my subscribers had this scenario:
“I have a user registration system with login/password. The login must be unique in the system. How do I implement such a requirement?”
Just in case you are not familiar with CQRS and ES, this problem arises because of something called ‘eventual consistency’.
Eventual consistency means there is a delay between the state of an application changing and that change being visible everywhere it is read. This delay is explicit in a typical CQRS ES system because the ‘write’ database is separate from the ‘read side’ database.
In traditional applications, the problem is resolved by preventing a user from making changes to the system until all the data is saved. Think postback cycle. This works great at low scale but soon becomes very slow if the updates are complex or at high scale.
In a CQRS and ES application, on the other hand, commands (requests for state changes) are processed separately from the read model being updated. This can result in a delay between the state change completing and the read model being up to date.
The advantage is that updates are much simpler – on the write side they are just appended to the event stream. And on the read side, they are also very simple owing to the de-normalised structure of a typical read model.
They are also far more scalable.
But eventual consistency also creates some challenges for the front end of a system. For more on that, read this article: 4 Ways to Handle Eventual Consistency on the UI.
Because of the delay between the state change and the read model being updated, there is a chance that two people may register with your system with the same username.
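To make the race concrete, here is a minimal in-memory sketch. The names `EventStore`, `UsernameReadModel` and `register` are hypothetical and not taken from any particular framework, and the read model is caught up by hand to stand in for the delay you would see in a real system.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Append-only write side (hypothetical, in-memory)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)


@dataclass
class UsernameReadModel:
    """Denormalised read side; it only changes when project() runs."""
    usernames: set = field(default_factory=set)
    position: int = 0  # index of the next event to project

    def project(self, store: EventStore) -> None:
        for event in store.events[self.position:]:
            if event["type"] == "UserRegistered":
                self.usernames.add(event["username"])
        self.position = len(store.events)


def register(store: EventStore, read_model: UsernameReadModel, username: str) -> bool:
    """Naive handler: checks uniqueness against a possibly stale read model."""
    if username in read_model.usernames:
        return False
    store.append({"type": "UserRegistered", "username": username})
    return True


store = EventStore()
read_model = UsernameReadModel()

# Two commands arrive before the read model has caught up.
first = register(store, read_model, "alice")
second = register(store, read_model, "alice")

read_model.project(store)  # the read side catches up, but too late
third = register(store, read_model, "alice")

print(first, second, third)  # True True False
print(len(store.events))     # 2
```

Both of the first two registrations succeed because the check runs against data that has not yet seen the first event. Only the third attempt, made after the projection has run, is rejected.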
The good news is there are a number of good strategies to solve this problem. But before I dive into them I need to cover a few points.
The reality is, most applications don’t need a custom registration system. There are plenty of off-the-shelf systems which work well and are properly maintained. Does this aspect of your application really need to use CQRS ES?
If not, don’t use it.
Always remember to refer to the business about these types of features. If it turns out that having a complete log of every account activity is critical (maybe you are in a regulated environment), then it may make sense. But don’t just use this approach as a blanket architecture for all aspects of your application.
It may be that your application uses email addresses as the username. In this case, the problem goes away, to a certain extent. The only possibility now is that one person ends up with two accounts, and the chances of this occurring are extremely low. So much so that it very likely isn’t worth your developers’ time to resolve.
But what if we do have to deal with consistency? Given this kind of scenario, what options are there to resolve this kind of set-based issue?
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code or system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood, with plenty of examples to learn from. By implication, this approach is generally done using CRUD-based operations. If you want to maintain the use of event sourcing, then you can try a hybrid approach.
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint, and reserve the address there before issuing the command. If you are unable to insert the row, then you should abandon the command. For these sorts of operations, it is best to use a data store that isn’t eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read side updates are often carried out in a different thread, process or even machine from the command, and there could be many different operations happening.
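The reservation step could look something like the sketch below. It uses Python’s built-in `sqlite3` module purely for illustration; the table name `reserved_logins`, the function names and the in-memory event list are assumptions, and in production you would use whichever strongly consistent store you already run.

```python
import sqlite3


def create_registry(conn: sqlite3.Connection) -> None:
    # The primary key gives us the uniqueness guarantee.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reserved_logins (login TEXT PRIMARY KEY)"
    )
    conn.commit()


def reserve_login(conn: sqlite3.Connection, login: str) -> bool:
    """Try to claim the login; the constraint decides who wins."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO reserved_logins (login) VALUES (?)", (login,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def handle_register_user(conn: sqlite3.Connection, event_stream: list, login: str) -> bool:
    """Reserve first, and only then record the event (hypothetical handler)."""
    if not reserve_login(conn, login):
        return False  # abandon the command
    event_stream.append({"type": "UserRegistered", "login": login})
    return True


conn = sqlite3.connect(":memory:")
create_registry(conn)
events = []

results = [
    handle_register_user(conn, events, "alice"),
    handle_register_user(conn, events, "alice"),
    handle_register_user(conn, events, "bob"),
]
print(results)      # [True, False, True]
print(len(events))  # 2
```

Note that the reservation and the event append are two separate writes here. If the append fails after the reservation succeeds, the login stays reserved, so a real system would need a way to release or expire stale reservations. That is part of the additional complexity mentioned above.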
To some this sounds like an oxymoron; however, it is a rather neat idea. Inconsistent things happen in systems all the time, and event sourcing allows you to handle these inconsistencies. Rather than throwing an exception and losing someone’s work, all in the name of data consistency, simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker/cheaper. Only if they do occur and only if it is of business value do you add features to compensate for the problem.
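As a rough sketch of ‘record the event and fix it later’, the function below scans the event stream and proposes a compensating event for every registration that reused an existing username. The event names `UserRegistered` and `DuplicateUsernameDetected` and the ‘first registration wins’ rule are assumptions for illustration. What actually happens next (asking the user to pick a new name, merging accounts, and so on) is a business decision.

```python
def find_duplicate_registrations(events: list) -> list:
    """Return compensating events for registrations that reused a username.

    Assumes the first registration wins. Users that already have a
    DuplicateUsernameDetected event are skipped, so re-running is safe.
    """
    already_flagged = {
        e["user_id"] for e in events if e["type"] == "DuplicateUsernameDetected"
    }
    owner_by_username = {}
    compensations = []
    for event in events:
        if event["type"] != "UserRegistered":
            continue
        username = event["username"]
        if username not in owner_by_username:
            owner_by_username[username] = event["user_id"]
        elif event["user_id"] not in already_flagged:
            compensations.append({
                "type": "DuplicateUsernameDetected",
                "user_id": event["user_id"],
                "username": username,
                "kept_user_id": owner_by_username[username],
            })
    return compensations


stream = [
    {"type": "UserRegistered", "user_id": "u1", "username": "alice"},
    {"type": "UserRegistered", "user_id": "u2", "username": "bob"},
    {"type": "UserRegistered", "user_id": "u3", "username": "alice"},
]

fixes = find_duplicate_registrations(stream)
print([f["user_id"] for f in fixes])  # ['u3']

stream.extend(fixes)  # the inconsistency is now recorded, not lost
print(find_duplicate_registrations(stream))  # []
```

Nothing was rejected and nobody lost their work. The duplicate is simply a fact in the stream that another part of the system can react to.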
Let’s take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced within the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player. You can only have one goalkeeper per team on the pitch at any one time. A data-driven approach may have an ‘IsGoalKeeper’ flag on the player. If the goalkeeper is sent off and an outfield player goes in the goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn’t accidentally assign a different player, resulting in two goalkeepers. Instead, we could model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.
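Here is one way the goalkeeper example might look with the rule modelled on the Team aggregate. The class, method and event names are hypothetical. The point is that the team holds a single goalkeeper slot, so a second goalkeeper cannot be represented at all.

```python
from dataclasses import dataclass, field
from typing import Optional


class InvalidGoalkeeper(Exception):
    """Raised when the chosen player is not on the pitch."""


@dataclass
class Team:
    players_on_pitch: set = field(default_factory=set)
    goalkeeper: Optional[str] = None  # a single slot, so at most one goalkeeper
    changes: list = field(default_factory=list)  # events raised by the aggregate

    def assign_goalkeeper(self, player_id: str) -> None:
        if player_id not in self.players_on_pitch:
            raise InvalidGoalkeeper(f"{player_id} is not on the pitch")
        # Overwriting the slot replaces any previous goalkeeper.
        self.goalkeeper = player_id
        self.changes.append({"type": "GoalkeeperAssigned", "player_id": player_id})

    def send_off(self, player_id: str) -> None:
        self.players_on_pitch.discard(player_id)
        if self.goalkeeper == player_id:
            self.goalkeeper = None
        self.changes.append({"type": "PlayerSentOff", "player_id": player_id})


team = Team(players_on_pitch={"keeper", "defender", "striker"})
team.assign_goalkeeper("keeper")
team.send_off("keeper")
team.assign_goalkeeper("defender")

print(team.goalkeeper)                    # defender
print([c["type"] for c in team.changes])  # ['GoalkeeperAssigned', 'PlayerSentOff', 'GoalkeeperAssigned']
```

No cross-aggregate check is needed because the invariant lives entirely inside one aggregate, which is exactly where consistency is enforced.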
You can see a good example of this in William Verdolini’s blog post on set validation here: http://williamverdolini.github.io/2014/08/16/cqrses-set-validation/
In reality, every system is ‘eventually consistent’. It’s just that in a CQRS ES approach it is explicit, which is actually a good thing.
Here is another post with another point of view you may find interesting: http://codebetter.com/gregyoung/2010/08/12/eventual-consistency-and-set-validation/
I hope this post gives you food for thought. I’d love to hear about other approaches. Do leave a comment below.
I've been a professional software engineer for nearly 15 years, and I'm lucky enough to work for a small but rapidly growing company in London called Redington. They have given me the technical freedom to learn some cutting edge technologies like CQRS and Event Sourcing. Now I'm sharing what I learn here.