Wednesday, June 6, 2012

Asynchronous Programming in .Net 4.5

In this post I would like to talk about .NET 4.5 and the enhancements made there for asynchronous programming.

Introduction
Threads are very expensive: each one consumes memory (about 1 MB of stack per thread) and costs time in initialization, finalization and context switching. Therefore, threads should be managed carefully by the thread pool. The goal is to create no more threads than needed.

When a thread performs an I/O operation such as networking or file-system access, Windows blocks the thread while the hardware device carries out the operation, and the thread resumes when the device finishes. So far so good, since a waiting thread should not waste precious CPU time.
There is a problem though: a blocked thread does not return to the thread pool, forcing the pool to create new threads for incoming requests, or even worse, to reject incoming requests.
I/O operations are not the only thing that blocks threads. For example, SqlConnection.Open can block a thread until it has an available connection to supply.

Consider the following piece of code:
figure 1

Here we have 4 blocking operations, which are highlighted in yellow.
This code is quite problematic for a scalable server. Imagine a server that serves tens, hundreds or even thousands of concurrent requests. While these operations wait to finish and finally release their thread, new requests keep coming in; since the blocked threads are not returned to the thread pool, the pool must produce more and more threads to handle the incoming requests. At best, the thread pool manages to keep up by producing more and more threads, which eventually degrades performance dramatically (as stated above). At worst, the number of threads reaches the thread pool's limit and incoming requests are queued.

Asynchronous Programming
.NET versions prior to 4.5 already had a (partial) solution to the problem stated above.
The solution came in the form of the Asynchronous Programming Model, with its Begin_xxx and End_xxx method pairs. For example: SqlCommand.BeginExecuteReader, WebRequest.BeginGetResponse and so forth.

Below is an example:

figure 2

Async programming in general is not the objective of this post, so I'm not going to explain in detail how SqlCommand.BeginExecuteReader works. I will say that when a thread invokes BeginExecuteReader it is not blocked and is free to continue to the next line. When the SqlCommand finishes executing the query against the DB, the inline method supplied to BeginExecuteReader is invoked by some available thread, not necessarily the one that called BeginExecuteReader. Thus, no threads are blocked the way they would be if they invoked SqlCommand.ExecuteReader.
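Figure 2 itself isn't reproduced here, but the Begin/End convention it uses can be sketched with Stream.BeginRead, which follows the very same APM pattern as BeginExecuteReader/EndExecuteReader (this is an illustrative stand-in, not the figure's actual SqlCommand code):

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

class ApmDemo
{
    public static string result;
    static readonly ManualResetEvent done = new ManualResetEvent(false);

    public static void Main()
    {
        var stream = new MemoryStream(Encoding.ASCII.GetBytes("hello"));
        var buffer = new byte[5];

        // BeginRead returns immediately; the calling thread is not blocked.
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            // This callback may run on a different (pool) thread.
            int read = stream.EndRead(ar);
            result = Encoding.ASCII.GetString(buffer, 0, read);
            done.Set();
        }, null);

        done.WaitOne(); // only to keep the console demo alive
        Console.WriteLine(result); // prints "hello"
    }
}
```

In a real server the calling thread would simply return to the pool instead of waiting on the event; the event is only there so the demo doesn't exit before the callback runs.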

As mentioned, this code belongs to .NET versions prior to 4.5 and it has some major drawbacks:
1 - using statements cannot be used.
2 - Inline (callback) methods are not very intuitive.
3 - There are no asynchronous counterparts for DataReader.GetInt32, DataReader.Read and SqlConnection.Open. Therefore, blocking operations are not fully avoided.

Actually, SqlConnection.Open can be a real bottleneck. I ran an experiment: I created an ASP.NET application with 2 pages: light.aspx and heavy.aspx. The light one performs a quick and simple operation, say, some simple calculation. The heavy one performs a heavy and long operation, say, executing some heavy query against the DB which might take a few seconds.
The heavy page uses a connection pool (not a thread pool!) of 30 connections.
I implemented the heavy page in 2 versions: a synchronous version implementing the code from figure 1 and an asynchronous version implementing the code from figure 2.
For both versions, a simple client application that I built sent 1000 requests for the heavy page and then 1000 requests for the light page.
What I'm interested to see in such an experiment is how the light pages respond when many requests for heavy pages are consuming threads from the thread pool.

I expected the results of the asynchronous version to be better, since fewer threads would be blocked by the heavy page and thus fewer threads would be created by the thread pool.
I was wrong: in both versions the server became unresponsive at some point, for both light and heavy pages.
I expected this for the synchronous version, but why did it also happen in the asynchronous version?
The answer is this: the first 30 requests were not blocked, since SqlConnection.Open had available connections to supply. But from the 31st request onward, SqlConnection.Open blocked every thread until some of the first 30 threads finished their job and released their connections. Thus, more and more threads became blocked, increasing the load on the thread pool. At some point, new incoming requests, whether for heavy pages or light pages, could not be handled and were queued.

Now we'll see how .net 4.5 can help us solve this problem.

.Net 4.5 - Asynchronous Programming
In the code below you can see the new way to implement async operations in .NET 4.5:

figure 3


The first thing to notice is two new keywords: await and async. To get these 2 new keywords you have 2 options: upgrade VS 2010 by installing this and then this, or start working with a higher version of VS: 2011 or 2012.
But since we are using features of async ADO.NET, which is part of .NET 4.5, we cannot use VS 2010 anyway, since it doesn't support them (as far as I know).

OK, now let's analyze it.
Figure 4 shows the control flow of the thread that executes the method shown in figure 3. There you can see that thread t1 is the executing thread and is the one that calls SqlConnect from within Foo.
When the thread executes the line await con.OpenAsync();, ADO.NET tries to allocate a free connection from the connection pool. If all connections are taken (and the limit is reached), the thread skips all the code below that line, returns to the point at which it entered SqlConnect and continues executing the Foo method. The code below await con.OpenAsync(); will be executed once a connection becomes available, and it will be executed by a thread which most likely won't be t1 (t2 in figure 4).
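Since we can't run ADO.NET here, a minimal self-contained sketch of this skip-and-resume control flow, with Task.Delay standing in for con.OpenAsync(), looks like this:

```csharp
using System;
using System.Threading.Tasks;

class AwaitDemo
{
    public static string log = "";

    static async Task SqlConnectLike()
    {
        log += "before-await;";
        await Task.Delay(100);   // stands in for con.OpenAsync()
        log += "after-await;";   // runs later, possibly on another pool thread
    }

    public static void Main()
    {
        Task t = SqlConnectLike();  // runs synchronously up to the await
        log += "back-in-caller;";   // the caller continues immediately, not blocked
        t.Wait();                   // only for the demo; a server thread would return to the pool
        Console.WriteLine(log);
        // prints "before-await;back-in-caller;after-await;"
    }
}
```

Note the order: the code after the await runs last, after control has already returned to the caller, which is exactly the behavior described above.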

figure 4


Of course, the same goes for all the other lines involving the await keyword: when the new thread (t2), which executes the rest of the code, reaches line 99 in figure 3, it will skip all the lines below it and return to the thread pool.

This ensures that SqlConnect has no blocking points, so no thread will be blocked by SqlConnect, which increases the overall availability of the thread pool's threads.

I went back to my experiment and changed the heavy page to implement the code from figure 3.
Just to remind you, what I'm interested to see in such an experiment is how the light pages respond when many requests for heavy pages are consuming threads from the thread pool.
I ran my test again and... good news! With the new version, the responsiveness of the server for light pages was the same whether or not heavy pages were running in the background. It means the heavy pages did not add any significant load on the thread pool.
Monitoring the thread pool confirmed this: it hardly needed any new threads to handle the requests for both heavy and light pages.

Thursday, March 8, 2012

Aggregate [DDD] - boosting the performance.

In this post I would like to tell you about an experiment I made: putting the Aggregate [DDD] pattern to the test.

From Eric Evan's book:
It is difficult to guarantee the consistency of changes to objects in a model with complex associations. Invariants need to be maintained that apply to closely related groups of objects, not just discrete objects. Yet cautious locking schemes cause multiple users to interfere pointlessly with each other and make a system unusable. [DDD, p. 126] 

According to this, the Aggregate pattern should provide a more efficient way to enforce invariants in a multi-user environment by significantly reducing DB locks.
OK, so I've put this to the test.

I've simulated an eCommerce store with 80,000 concurrent users who try to add/edit different OrderLines of 1000 different Orders. One or more users can work simultaneously on the same Order.
Invariant: each Order has a MaximumTotal that cannot be exceeded by the sum of the Amount of all of its OrderLines. 
I've used SQL Server 2005 + NHibernate 3.1.0.

So first I tried to enforce this invariant without the Aggregate pattern. I've created an OrderLineService with 2 methods:

This method gets an orderId and an amount. It fetches the Order eagerly with its OrderLines, finds the first OrderLine and tries to update its Amount with the amount passed as a parameter. Before updating the amount, we must check that the invariant is not going to be violated, so we calculate the sum of all of the OrderLines of the given Order.
But what if, right after we find out that the invariant is not going to be violated and right before committing the changes, a second concurrent user adds a new OrderLine that violates the invariant?
For example:
Order #123 has a MaxTotal of 100$. It has 2 OrderLines with 40$ each.
Two requests (r1 and r2) are arriving to the server simultaneously - r1 wants to update the first OrderLine to 50$ and r2 wants to add a new OrderLine with an amount of 20$. If both will succeed the invariant will be violated.
r1 fetches Order #123.
r2 fetches Order #123.
r1 checks the invariant, finds it OK and updates the amount (but has yet to commit).
r2 checks the invariant, finds it OK and adds a new OrderLine.
r2 commits.
Now in the DB we have 3 OrderLines for Order #123: 40$, 40$, 20$ - the invariant holds.
r1 commits.
Now in the DB we have 3 OrderLines for Order #123: 50$, 40$, 20$ - the invariant is violated and we don't even know about it.
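The interleaving above can be reproduced deterministically in memory. This is a simplified sketch (hypothetical types, no database; each "fetch" copies the order, each "commit" writes the change back), interleaving the two requests exactly as listed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Order
{
    public decimal MaxTotal = 100m;
    public List<decimal> OrderLines = new List<decimal> { 40m, 40m };
}

class LostUpdateDemo
{
    public static Order db = new Order();

    // Each request works on its own snapshot, like a separate transaction.
    static Order Fetch() =>
        new Order { MaxTotal = db.MaxTotal, OrderLines = new List<decimal>(db.OrderLines) };

    public static void Main()
    {
        var r1 = Fetch();                                  // r1 fetches Order #123
        var r2 = Fetch();                                  // r2 fetches Order #123

        r1.OrderLines[0] = 50m;                            // r1: update first line to 50$
        bool r1Ok = r1.OrderLines.Sum() <= r1.MaxTotal;    // 50+40 = 90  -> looks OK

        r2.OrderLines.Add(20m);                            // r2: add a 20$ line
        bool r2Ok = r2.OrderLines.Sum() <= r2.MaxTotal;    // 40+40+20 = 100 -> looks OK

        if (r2Ok) db.OrderLines = r2.OrderLines;           // r2 commits: 40, 40, 20
        if (r1Ok) db.OrderLines[0] = r1.OrderLines[0];     // r1 commits: 50, 40, 20

        Console.WriteLine(db.OrderLines.Sum());            // prints 110: invariant violated
    }
}
```

Both requests validated against a stale snapshot, both "succeeded", and the stored total (110$) silently exceeds MaxTotal.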

To prevent this we must lock the table with the RepeatableRead isolation level (in MSSQL 2005 this isolation level also prevents the phantom reads). This means that until that transaction commits, NO OTHER USER CAN INSERT A NEW ORDERLINE, NOT EVEN IF THE NEW ORDERLINE BELONGS TO ANOTHER ORDER!

The next Method:


This method gets an orderId and an amount. It fetches the Order eagerly with its OrderLines and tries to add a new OrderLine with the amount passed as a parameter. Before adding the new OrderLine, we must check that the invariant is not going to be violated, so we calculate the sum of all of the OrderLines of the given Order.
Again, for the same reason stated above, we must lock the table with RepeatableRead isolation.

Here is the code that simulates 80,000 concurrent users, all trying to add/edit different OrderLines with different amounts.

Without the Aggregate pattern it took ~200,000 milliseconds to process 80,000 concurrent requests.

Now let's simulate a scenario in which we do use the Aggregate pattern.
First, I added two new methods to the Order class: UpdateFirstOrderLine and AddOrderLine.

Next, I also added 2 new methods to OrderLineService: UpdateFirstOrderLineWithAggr and InsertNewOrderLineWithAggr:

As you can see, in these 2 methods I haven't used the RepeatableRead isolation level for any of the transactions, meaning 2 concurrent users can simultaneously add 2 different OrderLines to the same Order and potentially violate the invariant. So how can we tolerate this?
Let's look at the pattern's definition again:

Choose one Entity to be the root of each Aggregate, and control all access to the objects inside the boundary through the root.

Any change to the Order.OrderLines list, whether adding a new OrderLine or modifying an existing one, is done through the root (Order) and increases the root's Version (optimistic lock). Therefore, if two concurrent users try to add two different OrderLines to the same Order, one will succeed and the other will fail for trying to update an out-of-date instance of Order.
With this mechanism, I don't have to lock the whole OrderLines table any more - I can prevent simultaneous modifications through the root object.

Let's simulate this.
Order #123 has a MaxTotal of 100$ and Version 2. It has 2 OrderLines with 40$ each.
Two requests (r1 and r2) are arriving to the server simultaneously - r1 wants to update the first OrderLine to 50$ and r2 wants to add a new OrderLine with an amount of 20$. If both will succeed the invariant will be violated.
r1 fetches Order #123. Version is 2.
r2 fetches Order #123. Version is 2.
r1 checks the invariant, finds it OK and updates the amount (but has yet to commit).
r2 checks the invariant, finds it OK and adds a new OrderLine.
r2 commits.
Now in the DB we have 3 OrderLines for Order #123: 40$, 40$, 20$ - the invariant holds. Also, the Version is now 3.
r1 tries to commit, but with Version 2, and fails.
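The version check can be sketched in memory as well (hypothetical types; with NHibernate, the version mapping and the resulting "UPDATE ... WHERE Version = ?" do this for you, failing with a StaleObjectStateException):

```csharp
using System;

class VersionedOrder
{
    public int Version = 2;
    public decimal Total = 80m;   // two 40$ lines
}

class OptimisticLockDemo
{
    public static VersionedOrder db = new VersionedOrder();
    public static bool r1Ok, r2Ok;

    // A commit succeeds only if the caller saw the current version.
    public static bool Commit(int seenVersion, decimal newTotal)
    {
        if (seenVersion != db.Version) return false;  // stale instance -> reject
        db.Total = newTotal;
        db.Version++;   // any change through the root bumps the version
        return true;
    }

    public static void Main()
    {
        int r1Seen = db.Version;       // r1 fetches: Version 2
        int r2Seen = db.Version;       // r2 fetches: Version 2

        r2Ok = Commit(r2Seen, 100m);   // r2 adds a 20$ line: succeeds, Version -> 3
        r1Ok = Commit(r1Seen, 110m);   // r1 updates a line: fails, its Version 2 is stale

        Console.WriteLine($"r2: {r2Ok}, r1: {r1Ok}"); // prints "r2: True, r1: False"
    }
}
```

The invariant survives because the second writer is rejected instead of silently overwriting, and no table-wide lock was needed.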

For this mechanism to work properly, we need to be 100% sure that any change to one of the Aggregate's members increases the root's version. As Evans says:

Because the root controls access, it cannot be blindsided by changes to the internals.

Here is the code that simulates 80,000 concurrent users, all trying to add/edit different OrderLines with different amounts:


By using the Aggregate pattern it took ~54,000 milliseconds to process 80,000 concurrent requests.

Around 4 times faster!!!

All source files can be found here

Tuesday, January 17, 2012

hbm2net - c# instead of T4

I'm still mapping my entities to NHibernate using hbm files. I do this for several reasons, but I won't detail them now.

Since I'm using hbm files, I want to exploit one of their huge advantages: auto-generating lots of code that can be derived from the hbm files using the hbm2net tool.

For example:

- Auto-generate my POCO entities.
- Auto-generate my DTO entities.
- Auto-generate my server-side validations and their equivalent client-side validations.
- Implement cross-cutting behaviors like overriding GetHashCode() and Equals(), invoking NotifyPropertyChanged, etc.

In its earliest versions, hbm2net expected a Velocity script to auto-generate code. Recent versions can also work with T4 scripts.
hbm2net is great and it's hard to imagine working without it. Unfortunately, Velocity is not that user-friendly, and neither is T4.

My preferred way is to implement my own generator written in C#, which can be plugged into hbm2net instead of working with T4 or Velocity.

And why would I prefer C#...? Well, I guess that's obvious...

So, let's get to work.

First of all, download the latest version of hbm2net from here and extract it to wherever you like.

Next, create a Class Library project in Visual Studio and call it MyHbm2NetGenerator. 
Add a reference to NHibernate.Tool.hbm2net.dll (should be located where you've extracted the zip file).
Add a class to this project and call it POCOGenerator. This class should be derived from NHibernate.Tool.hbm2net.AbstractRenderer and implement NHibernate.Tool.hbm2net.ICanProvideStream:


hbm2net will create a single instance of this class and use it to generate the derived code for all hbm files.

Next, implement ICanProvideStream.CheckIfSourceIsNewer:


hbm2net will invoke CheckIfSourceIsNewer for each hbm file. The source parameter is the LastWriteTimeUtc of the current hbm file. The directory parameter is the path of the output directory in which the generated files are stored. This method should return true if source is greater than the LastWriteTimeUtc of the generated file - that is, if the hbm file has changed since the POCO file was last generated.
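The comparison itself boils down to a few lines. Here is a standalone sketch of that logic (the method name and shape are mine, not hbm2net's exact signature):

```csharp
using System;
using System.IO;

class NewerCheckDemo
{
    // true when the hbm file changed after the generated file was last written,
    // or when the generated file doesn't exist yet
    public static bool IsSourceNewer(DateTime source, string generatedFile) =>
        !File.Exists(generatedFile) ||
        source > File.GetLastWriteTimeUtc(generatedFile);

    public static void Main()
    {
        string tmp = Path.GetTempFileName();   // stands in for a generated POCO file
        try
        {
            DateTime generated = File.GetLastWriteTimeUtc(tmp);
            Console.WriteLine(IsSourceNewer(generated.AddMinutes(1), tmp));    // True: hbm edited after generation
            Console.WriteLine(IsSourceNewer(generated.AddMinutes(-1), tmp));   // False: generation is up to date
            Console.WriteLine(IsSourceNewer(DateTime.UtcNow, "no-such-file")); // True: nothing generated yet
        }
        finally { File.Delete(tmp); }
    }
}
```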

The method GetFileName receives the parameter clazz, which holds almost all the details about the POCO entity that is going to be generated. I'll give more details about this class soon, but for now, all we need in this method is the POCO entity name, which can be found in clazz.GeneratedName.


Next, implement ICanProvideStream.GetStream:

hbm2net will invoke this method to get a stream into which the content of the current generated POCO entity is flushed.

Next, override the Render method. This is the main method, where you generate the content of the POCO entity and flush it (save it).


Now, implement a method that generates the POCO's content (GeneratePOCO is the name I gave it).
Of course you should use the ClassMapping object to get all the POCO's details, e.g. class name, class modifiers, base class, properties, fields etc.
In the next post I will show in more detail what can be done with ClassMapping in order to generate the desired POCO content.
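To give a rough idea of what GeneratePOCO can look like, here is a standalone sketch that emits a class from a stubbed-down mapping. The stub types below are hypothetical stand-ins for the data the real ClassMapping exposes:

```csharp
using System;
using System.Text;

// Hypothetical stand-ins for what hbm2net's ClassMapping provides
class PropertyStub { public string Name; public string Type; }
class ClassMappingStub
{
    public string GeneratedName;
    public PropertyStub[] Properties;
}

class PocoGeneratorSketch
{
    // Emits a class with virtual auto-properties (NHibernate needs virtual members for proxies)
    public static string GeneratePoco(ClassMappingStub clazz, string ns)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"namespace {ns}");
        sb.AppendLine("{");
        sb.AppendLine($"    public class {clazz.GeneratedName}");
        sb.AppendLine("    {");
        foreach (var p in clazz.Properties)
            sb.AppendLine($"        public virtual {p.Type} {p.Name} {{ get; set; }}");
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }

    public static void Main()
    {
        var clazz = new ClassMappingStub
        {
            GeneratedName = "Order",
            Properties = new[]
            {
                new PropertyStub { Name = "Id", Type = "int" },
                new PropertyStub { Name = "MaxTotal", Type = "decimal" }
            }
        };
        Console.WriteLine(GeneratePoco(clazz, "MyApp.Domain"));
    }
}
```

In the real renderer you would write this string to the stream returned by GetStream instead of the console.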

OK, we're getting there: compile your MyHbm2NetGenerator project and then copy MyHbm2NetGenerator.dll to the directory where you extracted hbm2net.

Next, create an xml file, call it config.xml (or whatever...) and put it wherever you like. config.xml should look like this:


renderer is the FullyQualifiedName of your POCOGenerator class. package is the namespace for your POCO entities - you will receive it in POCOGenerator.Render as the savedToPackage parameter.
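Based on that description, a minimal config.xml might look roughly like this (the element and attribute names here are an assumption on my part; check the sample config shipped with hbm2net for the exact format):

```xml
<codegen>
  <!-- renderer: fully qualified name of the generator class; package: target namespace -->
  <generate renderer="MyHbm2NetGenerator.POCOGenerator, MyHbm2NetGenerator"
            package="MyApp.Domain" />
</codegen>
```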

Now, execute the following command in the command shell:

<hbm2net dir>\hbm2net.exe --config=<config dir>\config.xml --output=<output dir> <hbm files dir>\*.hbm.xml


And that's it! Go to <output dir> to see your generated files.


To make hbm2net auto-generate your code on every build of your domain/DTO/validations project, you can add a pre/post-build event in your project settings with the command line I just showed you.

download code example