Thursday, March 8, 2012

Aggregate [DDD] - boosting performance.

In this post I would like to tell you about an experiment I've made - putting the Aggregate [DDD] pattern to the test.

From Eric Evans' book:
It is difficult to guarantee the consistency of changes to objects in a model with complex associations. Invariants need to be maintained that apply to closely related groups of objects, not just discrete objects. Yet cautious locking schemes cause multiple users to interfere pointlessly with each other and make a system unusable. [DDD, p. 126] 

According to this, the Aggregate pattern should provide a more efficient way to enforce invariants in a multi-user environment, by significantly reducing DB locks.
OK, so I've put this to the test.

I've simulated an eCommerce store with 80,000 concurrent users that try to add/edit different OrderLines of 1,000 different Orders. One or more users can work simultaneously on the same Order.
Invariant: each Order has a MaxTotal that cannot be exceeded by the sum of the Amounts of all of its OrderLines.
I've used SQL Server 2005 + NHibernate 3.1.0.

So first I tried to enforce this invariant without the Aggregate pattern. I created an OrderLineService with 2 methods.
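Below is a minimal sketch of how the first method, UpdateFirstOrderLine, could look; the class shell, the sessionFactory field and the exact NHibernate calls are my assumptions, not the original code:

using System;
using System.Data;
using System.Linq;
using NHibernate;
using NHibernate.Criterion;
using NHibernate.Transform;

public class OrderLineService
{
    private readonly ISessionFactory sessionFactory;

    public OrderLineService(ISessionFactory sessionFactory)
    {
        this.sessionFactory = sessionFactory;
    }

    public void UpdateFirstOrderLine(int orderId, decimal amount)
    {
        using (var session = sessionFactory.OpenSession())
        // RepeatableRead - see below for why this is needed
        using (var tx = session.BeginTransaction(IsolationLevel.RepeatableRead))
        {
            // Fetch the Order eagerly, together with its OrderLines
            var order = session.CreateCriteria<Order>()
                .Add(Restrictions.IdEq(orderId))
                .SetFetchMode("OrderLines", FetchMode.Join)
                .SetResultTransformer(Transformers.DistinctRootEntity)
                .UniqueResult<Order>();

            var firstLine = order.OrderLines.First();

            // Invariant check: the sum of all Amounts must not exceed MaxTotal
            var newTotal = order.OrderLines.Sum(l => l.Amount) - firstLine.Amount + amount;
            if (newTotal > order.MaxTotal)
                throw new InvalidOperationException("Order MaxTotal exceeded");

            firstLine.Amount = amount;
            tx.Commit();
        }
    }
}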

This method gets an orderId and an amount. It fetches the Order eagerly with its OrderLines, finds the first OrderLine and tries to update its Amount with the amount passed as a parameter. Before updating the amount, we must check that the invariant is not going to be violated, so we calculate the sum of all of the OrderLines of the given Order.
But what if, right after we find out that the invariant is not going to be violated and right before we commit the changes, a second concurrent user adds a new OrderLine that violates the invariant?
For example:
Order #123 has a MaxTotal of $100. It has 2 OrderLines of $40 each.
Two requests (r1 and r2) arrive at the server simultaneously - r1 wants to update the first OrderLine to $50 and r2 wants to add a new OrderLine with an amount of $20. If both succeed, the invariant will be violated.
r1 fetches Order #123.
r2 fetches Order #123.
r1 checks the invariant, finds it to be OK and updates the amount (but has yet to commit).
r2 checks the invariant, finds it to be OK and adds a new OrderLine.
r2 commits.
Now in the DB we have 3 OrderLines for Order #123: $40, $40, $20 - the invariant holds.
r1 commits.
Now in the DB we have 3 OrderLines for Order #123: $50, $40, $20 - the invariant is violated and we don't even know about it.

To prevent this we must lock the table, by opening the transaction with the RepeatableRead isolation level (to be precise, phantom reads in MSSQL 2005 are only fully prevented by the Serializable isolation level; RepeatableRead holds shared locks on the rows it has read until the transaction ends). The price is that until that transaction commits - NO OTHER USER CAN INSERT A NEW ORDERLINE, NOT EVEN IF THE NEW ORDERLINE BELONGS TO ANOTHER ORDER!

The next method:
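Again only a sketch, added to the same OrderLineService as above (I'm assuming the non-aggregate insert method was simply named InsertNewOrderLine):

public void InsertNewOrderLine(int orderId, decimal amount)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction(IsolationLevel.RepeatableRead))
    {
        var order = session.CreateCriteria<Order>()
            .Add(Restrictions.IdEq(orderId))
            .SetFetchMode("OrderLines", FetchMode.Join)
            .SetResultTransformer(Transformers.DistinctRootEntity)
            .UniqueResult<Order>();

        // Invariant check before adding the new line
        var newTotal = order.OrderLines.Sum(l => l.Amount) + amount;
        if (newTotal > order.MaxTotal)
            throw new InvalidOperationException("Order MaxTotal exceeded");

        // cascade="all-delete-orphan" on the mapping persists the new line
        order.OrderLines.Add(new OrderLine { Amount = amount });
        tx.Commit();
    }
}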


This method gets an orderId and an amount. It fetches the Order eagerly with its OrderLines and tries to add a new OrderLine with the amount passed as a parameter. Before adding the new OrderLine, we must check that the invariant is not going to be violated, so we calculate the sum of all of the OrderLines of the given Order.
Again, for the same reason stated above, we must run the transaction with the RepeatableRead isolation level.

Here is the code that simulates 80,000 concurrent users, all trying to add/edit different OrderLines with different amounts.
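The harness itself isn't reproduced here, so the following is a sketch of what it could look like; Parallel.For, the 50/50 update/insert split and the orderId/amount formulas are my assumptions (using System.Diagnostics and System.Threading.Tasks):

var service = new OrderLineService(sessionFactory);
var stopwatch = Stopwatch.StartNew();

// 80,000 requests spread over the 1,000 Orders: even i's update the
// first OrderLine, odd i's insert a new one.
Parallel.For(0, 80000, i =>
{
    var orderId = (i % 1000) + 1;
    var amount = (decimal)(i % 7 + 1);
    try
    {
        if (i % 2 == 0)
            service.UpdateFirstOrderLine(orderId, amount);
        else
            service.InsertNewOrderLine(orderId, amount);
    }
    catch (InvalidOperationException)
    {
        // the change would have violated the invariant - request rejected
    }
});

stopwatch.Stop();
Console.WriteLine("Processed 80,000 requests in {0} ms", stopwatch.ElapsedMilliseconds);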

Without the Aggregate pattern, it took ~200,000 milliseconds to process the 80,000 concurrent requests.

Now let's simulate a scenario in which we do use the Aggregate pattern.
First, I added two new methods to the Order class: UpdateFirstOrderLine and AddOrderLine.
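A sketch of the two entities; everything except the UpdateFirstOrderLine/AddOrderLine methods and the invariant itself is my assumption about the original classes (using System.Collections.Generic and System.Linq):

public class OrderLine
{
    public virtual int Id { get; protected set; }
    public virtual decimal Amount { get; set; }
}

public class Order
{
    public virtual int Id { get; protected set; }
    public virtual int Version { get; protected set; }   // the optimistic-lock column
    public virtual decimal MaxTotal { get; set; }
    public virtual IList<OrderLine> OrderLines { get; protected set; }

    public Order()
    {
        OrderLines = new List<OrderLine>();
    }

    // All modifications go through the root, so the invariant is
    // enforced in exactly one place.
    public virtual void UpdateFirstOrderLine(decimal amount)
    {
        var firstLine = OrderLines.First();
        CheckInvariant(Total() - firstLine.Amount + amount);
        firstLine.Amount = amount;
    }

    public virtual void AddOrderLine(decimal amount)
    {
        CheckInvariant(Total() + amount);
        OrderLines.Add(new OrderLine { Amount = amount });
    }

    private decimal Total()
    {
        return OrderLines.Sum(l => l.Amount);
    }

    private void CheckInvariant(decimal newTotal)
    {
        if (newTotal > MaxTotal)
            throw new InvalidOperationException("Order MaxTotal exceeded");
    }
}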

Next, I added 2 new methods to OrderLineService: UpdateFirstOrderLineWithAggr and InsertNewOrderLineWithAggr:
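Sketches again; note that no isolation level is passed to BeginTransaction. The session.Lock(order, LockMode.Force) call is my assumption about how the version increment was guaranteed for the update case (see the quote further down):

public void UpdateFirstOrderLineWithAggr(int orderId, decimal amount)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())   // default isolation level
    {
        var order = session.CreateCriteria<Order>()
            .Add(Restrictions.IdEq(orderId))
            .SetFetchMode("OrderLines", FetchMode.Join)
            .SetResultTransformer(Transformers.DistinctRootEntity)
            .UniqueResult<Order>();

        order.UpdateFirstOrderLine(amount);

        // Changing a child alone would not dirty the root, so force a
        // version increment on the Order (assumption - see below)
        session.Lock(order, LockMode.Force);
        tx.Commit();
    }
}

public void InsertNewOrderLineWithAggr(int orderId, decimal amount)
{
    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        var order = session.CreateCriteria<Order>()
            .Add(Restrictions.IdEq(orderId))
            .SetFetchMode("OrderLines", FetchMode.Join)
            .SetResultTransformer(Transformers.DistinctRootEntity)
            .UniqueResult<Order>();

        // Adding to the mapped collection dirties the Order itself,
        // so its Version is incremented on flush
        order.AddOrderLine(amount);
        tx.Commit();
    }
}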

As you can see, in these 2 methods I haven't used the RepeatableRead isolation level for any of the transactions, meaning 2 concurrent users can simultaneously add 2 different OrderLines to the same Order and potentially violate the invariant. So how can we tolerate this?
Let's look at the pattern's definition again:

Choose one Entity to be the root of each Aggregate, and control all access to the objects inside the boundary through the root.

Any change to the Order.OrderLines list, whether it's adding a new OrderLine or modifying an existing one, is done through the root (Order) and increases the root's Version (optimistic locking). Therefore, if two concurrent users try to add two different OrderLines to the same Order - one will succeed and the other will fail for trying to update an out-of-date instance of Order.
With this mechanism I don't have to lock the whole OrderLines table any more - I can prevent simultaneous modifications through the root object.
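The mappings aren't shown in the post; here is the kind of Order.hbm.xml this relies on (the assembly, table and column names are my assumptions). The essential parts are the <version> element and the cascading, non-inverse collection:

<?xml version="1.0" encoding="utf-8"?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="Shop" namespace="Shop">
  <class name="Order" table="Orders">
    <id name="Id">
      <generator class="native" />
    </id>
    <!-- NHibernate appends "and Version = ?" to every UPDATE of this row
         and increments the value, throwing StaleObjectStateException
         when the row has changed under our feet -->
    <version name="Version" />
    <property name="MaxTotal" />
    <!-- non-inverse + cascade: collection changes dirty the Order itself,
         so adding an OrderLine bumps the Order's Version too -->
    <bag name="OrderLines" cascade="all-delete-orphan">
      <key column="OrderId" />
      <one-to-many class="OrderLine" />
    </bag>
  </class>
</hibernate-mapping>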

Let's simulate this.
Order #123 has a MaxTotal of $100 and Version 2. It has 2 OrderLines of $40 each.
Two requests (r1 and r2) arrive at the server simultaneously - r1 wants to update the first OrderLine to $50 and r2 wants to add a new OrderLine with an amount of $20. If both succeed, the invariant will be violated.
r1 fetches Order #123. Version is 2.
r2 fetches Order #123. Version is 2.
r1 checks the invariant, finds it to be OK and updates the amount (but has yet to commit).
r2 checks the invariant, finds it to be OK and adds a new OrderLine.
r2 commits.
Now in the DB we have 3 OrderLines for Order #123: $40, $40, $20 - the invariant holds. Also, the Version is now 3.
r1 tries to commit with Version 2 and fails (NHibernate throws a StaleObjectStateException).

For that mechanism to work properly, we need to be 100% sure that any change to one of the Aggregate's members will increase the root's version. As Evans says:

Because the root controls access, it cannot be blindsided by changes to the internals.
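In NHibernate terms this is the one trap: by default, dirtying only a child entity (e.g. an OrderLine's Amount) does not dirty the root, so Order's Version would stay put. That is why the update sketch above forces the increment explicitly - again, my assumption of how to meet this requirement:

// Inside an open session/transaction, with the Order already loaded:
order.OrderLines.First().Amount = 50m;   // dirties the OrderLine only -
                                         // Order.Version would NOT change

// Forcing a version increment on the root makes this change conflict
// with any other writer of the same Aggregate:
session.Lock(order, LockMode.Force);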

Here is the code that simulates 80,000 concurrent users, all trying to add/edit different OrderLines with different amounts:
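Same harness as before, with one difference (and one more assumption of mine): the post doesn't say what happens to a request that loses the optimistic race, so this sketch simply retries it on StaleObjectStateException:

var service = new OrderLineService(sessionFactory);
var stopwatch = Stopwatch.StartNew();

Parallel.For(0, 80000, i =>
{
    var orderId = (i % 1000) + 1;
    var amount = (decimal)(i % 7 + 1);
    var done = false;
    while (!done)
    {
        try
        {
            if (i % 2 == 0)
                service.UpdateFirstOrderLineWithAggr(orderId, amount);
            else
                service.InsertNewOrderLineWithAggr(orderId, amount);
            done = true;
        }
        catch (StaleObjectStateException)
        {
            // lost the optimistic race - reload and try again
        }
        catch (InvalidOperationException)
        {
            done = true;   // invariant would be violated - request rejected
        }
    }
});

stopwatch.Stop();
Console.WriteLine("Processed 80,000 requests in {0} ms", stopwatch.ElapsedMilliseconds);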


With the Aggregate pattern, it took ~54,000 milliseconds to process the 80,000 concurrent requests.

Around 4 times faster!!!

All source files can be found here
