This question keeps cropping up: how do you handle set-based consistency validation in CQRS? In fact, one of my subscribers had this scenario:
“I have a user registration system with login/password. The login must be unique in the system. How do I implement such a requirement?”
Just in case you are not familiar with CQRS and ES, this problem arises because of something called ‘eventual consistency’.
What Is Eventual Consistency?
Eventual consistency refers to the lag between the state of an application changing and that change becoming visible to readers of the data. This time difference is explicit in a typical CQRS ES system because the ‘write’ database is separate from the ‘read side’ database.
In traditional applications, the problem is resolved by preventing a user from making changes to the system until all the data is saved. Think postback cycle. This works well at low scale but soon becomes very slow if the updates are complex or the load is high.
In a CQRS and ES application, on the other hand, commands (requests for state changes) are processed separately from the read model being updated. This can result in a delay between the state change completing and the read model being up to date.
The advantage is that updates are much simpler: on the write side they are just appended to the event stream, and on the read side they are also very simple owing to the de-normalised structure of a typical read model.
They are also far more scalable.
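To make this concrete, here is a minimal sketch of the write/read split, assuming an in-memory event stream and read model. The handler, event shape and all names are purely illustrative, not a prescription:

import uuid

event_stream = []          # write side: append-only log of events
users_read_model = {}      # read side: de-normalised username -> user id

def handle_register_user(username):
    # Command handler: appends an event; it does not touch the read model.
    event_stream.append({
        "type": "UserRegistered",
        "user_id": str(uuid.uuid4()),
        "username": username,
    })

def project(event):
    # Projection: updates the read model, possibly much later and elsewhere.
    if event["type"] == "UserRegistered":
        users_read_model[event["username"]] = event["user_id"]

handle_register_user("alice")
for event in event_stream:   # in a real system this runs asynchronously
    project(event)

The gap between the append and the projection running is exactly where the eventual consistency lives.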
But eventual consistency also creates some challenges for the front end of a system. For more on that, read this article: 4 Ways to Handle Eventual Consistency on the UI.
What Is The Actual Problem?
Because of the delay between the state change and the read model being updated, there is a chance that two people may register with your system with the same username.
The good news is there are a number of good strategies to solve this problem. But before I dive into them I need to cover a few points.
Do You Really Need To Implement a Registration System Using a CQRS ES Approach?
The reality is, most applications don’t need a custom registration system. There are plenty of off-the-shelf systems which work well and are properly maintained. Does this aspect of your application really need to use CQRS ES?
If not, don’t use it.
Always remember to refer to the business about these types of features. If it turns out that having a complete log of every account activity is critical – maybe you are in a regulated environment – then it may make sense. But don’t just use this approach as a blanket architecture for all aspects of your application.
Special Cases
It may be that your application uses email addresses as the username. In this case, the problem goes away – to a certain extent. The only possibility now is that one person ends up with two accounts. And the chances of this occurring are extremely slim. So much so that it very likely isn’t worth your developers’ time to resolve.
Approaches to Dealing with Set Based Validation
But what if we do have to deal with consistency? Given this kind of scenario, what options are there to resolve this kind of set-based issue?
1. Locking, Transactions and Database Constraints
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code/system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood, with plenty of examples to learn from. By implication, this approach is generally done using CRUD-based operations. If you want to maintain the use of event sourcing then you can try a hybrid approach.
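As a rough illustration, here is what the constraint-based route looks like with a relational store. I’m using SQLite for brevity; the table and column names are assumptions:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT PRIMARY KEY, password_hash TEXT)")

def register(login, password_hash):
    try:
        with conn:  # transaction: commits on success, rolls back on error
            conn.execute("INSERT INTO users VALUES (?, ?)",
                         (login, password_hash))
        return True
    except sqlite3.IntegrityError:
        return False  # another registration already claimed this login

assert register("alice", "hash1")
assert not register("alice", "hash2")  # duplicate rejected by the constraint

The database does the set-based check for you, at the price of the scaling characteristics mentioned above.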
2. Hybrid Locking Field
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint, and reserve the username before issuing the command. If you are unable to insert the row, then you should abandon the command. For these sorts of operations, it is best to use a data store that isn’t eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read side updates are often carried out in a different thread, process or even machine to the command, and there could be many different operations happening.
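Here is a sketch of that reservation step, again assuming a strongly consistent relational store for the lookup table; the command dispatch at the end is a hypothetical stand-in:

import sqlite3

registry = sqlite3.connect(":memory:")
registry.execute("CREATE TABLE username_registry (username TEXT PRIMARY KEY)")

def try_reserve(username):
    try:
        with registry:
            registry.execute("INSERT INTO username_registry VALUES (?)",
                             (username,))
        return True
    except sqlite3.IntegrityError:
        return False  # someone else already holds the reservation

def register_user(username, command_bus):
    if not try_reserve(username):
        raise ValueError("username %r is already taken" % username)
    # Only now is the command issued. If it later fails, releasing the
    # reservation needs to be part of your failure handling.
    command_bus.send({"type": "RegisterUser", "username": username})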
3. Rely on the Eventually Consistent Read Model
To some this sounds like an oxymoron; however, it is a rather neat idea. Inconsistent things happen in systems all the time, and event sourcing allows you to handle these inconsistencies. Rather than throwing an exception and losing someone’s work, all in the name of data consistency, simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting that these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker and cheaper. Only if they do occur, and only if it is of business value, do you add features to compensate for the problem.
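A sketch of what that might look like in a projection: rather than rejecting anything, it records the clash and defers the fix. The event shape and the review hook are assumptions, not a prescribed API:

usernames_seen = {}   # username -> user id of the first registration

def project_user_registered(event):
    username, user_id = event["username"], event["user_id"]
    first = usernames_seen.setdefault(username, user_id)
    if first != user_id:
        # Two registrations slipped through the consistency gap. Nothing
        # is thrown away; the clash is recorded and resolved later, e.g.
        # by asking the second user to pick a new name.
        flag_for_review(username=username, first=first, duplicate=user_id)

def flag_for_review(**clash):
    print("duplicate username detected, compensate later:", clash)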
4. Re-examine the Domain Model
Let’s take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially, we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced within the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player, and you can only have one goalkeeper per team on the pitch at any one time. A data-driven approach may have an ‘IsGoalKeeper’ flag on the player. If the goalkeeper is sent off and an outfield player goes in goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn’t accidentally assign a different player, resulting in two goalkeepers. In this scenario, we could instead model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.
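A minimal sketch of that idea, with the invariant moved onto a Team aggregate so it sits inside a single consistency boundary (all names here are illustrative):

class Team:
    def __init__(self, player_ids):
        self.player_ids = set(player_ids)
        self.goalkeeper_id = None
        self.changes = []   # uncommitted events

    def define_goalkeeper(self, player_id):
        if player_id not in self.player_ids:
            raise ValueError("player is not in this team")
        # Assigning a new goalkeeper implicitly unassigns the old one,
        # so two goalkeepers can never coexist within the aggregate.
        self.changes.append({
            "type": "GoalkeeperDefined",
            "previous": self.goalkeeper_id,
            "player_id": player_id,
        })
        self.goalkeeper_id = player_id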
You can see a good example of this in William Verdolini’s blog post on set validation here: http://williamverdolini.github.io/2014/08/16/cqrses-set-validation/
Conclusion
In reality, every system is ‘eventually consistent’. It’s just that in a CQRS ES approach it is explicit – which is actually a good thing.
Here is another post with another point of view you may find interesting: http://codebetter.com/gregyoung/2010/08/12/eventual-consistency-and-set-validation/
I hope this post gives you food for thought. I’d love to hear about other approaches. Do leave a comment below.
Comments

tl;dr:
IMHO: change the way you think. When you really want CQRS and ES, the biggest issue is concurrency, not the problem defined in the question: “Because of the delay between the state change and the read model being updated, there is a chance that two people may register with your system with the same username.”
I guess I don’t really understand the issue here. If you want to check whether a user already exists, I think you’re getting CQRS wrong. The whole idea is the always-appending log, with views/read models built from those logs.
For the soccer team with one goalkeeper, I would create a single aggregate “soccerTeam” with an action defineGoalKeeper(personId), and the soccer team should hold data on all the IDs of its team members, etc.
For the registration system, you can make your aggregate Id the login name and define an action, createNewLogin(username). Your domain will try to retrieve the aggregate with that Id and, if it exists, throw an exception. If you don’t want the aggregateId to be a string, generate a UUID based on the string and use that as your aggregateId.
Thanks André. You are right in that concurrency is an interesting topic. CQRS ES allows far finer-grained concurrency detection, which reduces the potential problems. However, people do experience this eventual consistency issue, especially when they are not used to CQRS and ES.
I’m not sure that just because you need to check uniqueness “you’re getting CQRS wrong”. If it’s something the business needs then it’s something you have to handle.
Regarding the soccerTeam, I think that is what I suggested. Although you wouldn’t need the personId on the command defineGoalKeeper as you would be issuing the command against an already loaded domain object.
Technically you could make the login name the aggregate Id, but I’m not sure anyone would remember it or use it, so it may not be very practical.
Sorry, you are right I didn’t read the last line on the soccer team part.
With the login name I mean you don’t need an Id as in a UUID. The login name, which must be unique, could be your Id. If you don’t want that, it is quite easy to generate a UUID which is unique based on the unique username. So your userAggregate could have a registerAsNewUser command. It tries to find an existing userAggregate and, if it finds one, raises the exception.
It’s basically the same as validating any other command; you only need to define a unique Id, which could be a username IMHO. In your read model you can always do some magic and generate a consistent unique UUID. In this case the aggregateId is a string rather than a UUID. You only end up with the concurrency issue, not the read model being late.
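For what it’s worth, here is a small illustration of that deterministic-UUID idea, using Python’s name-based uuid5; the namespace value is an arbitrary assumption:

import uuid

USER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "users.example.com")

def aggregate_id_for(username):
    # uuid5 is deterministic: the same username always yields the same id.
    return uuid.uuid5(USER_NAMESPACE, username.lower())

assert aggregate_id_for("Alice") == aggregate_id_for("alice")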
It is actually the same as the case where a malfunctioning client sends the same command twice in parallel, and the system raises objectWithUUIDCreated twice. In an ideal world the second request would have got an exception, aggregateAlreadyExists…
It’s actually quite easy to make a unit test for it, with just one event in the given clause:
given(userRegistered(xyz));
when(registerUser(xyz));
thenFailWith(UserAlreadyExistsException);
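A runnable approximation of that test, assuming a tiny in-memory aggregate store (every name here is made up for illustration):

class UserAlreadyExistsException(Exception):
    pass

existing_aggregates = {"xyz"}            # given: userRegistered(xyz)

def register_user(username):
    if username in existing_aggregates:  # aggregate with this id exists
        raise UserAlreadyExistsException(username)
    existing_aggregates.add(username)

try:
    register_user("xyz")                 # when: registerUser(xyz)
except UserAlreadyExistsException:
    print("then: failed with UserAlreadyExistsException, as expected")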
Given that you can’t, or at least shouldn’t, query a domain model, the only way to check uniqueness is via the read model (unless you implement some of the strategies suggested in the article). If the read model is eventually consistent, it may not be up to date when you check – hence the possibility of a non-unique username getting in. I hope that makes sense.