If you've written any high-performance asynchronous applications, you may have found yourself putting up memory barriers to avoid those costly thread locks. More commonly known as memory fences, this technique can allow you to create lock-free read-modify-write operations, which can give you that little extra bit of performance that some applications need. To be fair, this is an advanced technique, so if you are embarking on this sort of thing, be very careful and do lots of testing.
The problem with using lock-free approaches is that you now have to deal with atomicity on your own, and this may not always be as straightforward as you might think. Let's look at a couple of simple examples of what is and is not atomic.
Single reads and writes of 32-bit values are always atomic:
Int32 number = 123; /* single writes are atomic */
number.ToString(); /*single reads are atomic */
Increment and compound assignment operations are another story, since each one is really a read-modify-write sequence: the value is read, changed, and written back as separate steps. As a result, using them in a lock-free environment can lead to unexpected results:
number++;
number += 2;
This means that another thread or process could alter the value of 'number' between its read and its write. When this happens you will certainly be left scratching your head wondering how 1 += 1 could possibly equal 1294.
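To make the lost update concrete, here is a minimal sketch that replays one unlucky interleaving on a single thread, so the result is the same every time. The class and method names (LostUpdateDemo, SimulateInterleavedIncrements) are made up for illustration; a real race depends on timing and will not fail this predictably.

```csharp
using System;

public static class LostUpdateDemo
{
    // Replays, step by step, what can happen when two threads both run
    // "number++" on the same variable. Hypothetical helper for illustration.
    public static int SimulateInterleavedIncrements(int start)
    {
        int number = start;

        int readByA = number;   // thread A: read
        int readByB = number;   // thread B: read, before A has written back

        number = readByA + 1;   // thread A: write
        number = readByB + 1;   // thread B: write, overwriting A's update

        return number;
    }

    public static void Main()
    {
        // Two increments ran, but only one of them is reflected in the result.
        Console.WriteLine(SimulateInterleavedIncrements(0)); // prints 1
    }
}
```

Both "threads" did their work, yet the counter only moved by one. That is the whole problem in miniature.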
64-bit types present an even more subtle problem. Single reads or writes of 64-bit types 'may' not be atomic the same way their 32-bit counterparts are. That's right, I said 'may' not... Here's the issue. To store a 64-bit value in a 32-bit environment, the runtime needs two separate 32-bit memory locations, and consequently it takes two separate instructions to read or write the value. This creates a vulnerability similar to the read-modify-write problem above, called a torn read. When this occurs, one of the memory locations reflects a value from one thread or process while the other reflects a value from another thread or process. In the end, this will cause you no end of pain and is next to impossible to debug.
Thread or Process A writes 5,000,000,000, which is stored in two 32-bit memory locations (high word -- low word) as follows:
00000000000000000000000000000001 -- 00101010000001011111001000000000
Thread or Process B writes 10,000,000,000, which is stored in the same two 32-bit locations as follows:
00000000000000000000000000000010 -- 01010100000010111110010000000000
A read by either could result in the following combination, which is 5,705,032,704, a value nobody ever wrote:
00000000000000000000000000000001 -- 01010100000010111110010000000000
In 64-bit environments this is not an issue, because a (properly aligned) 64-bit value needs only one memory location and can be read or written with a single instruction.
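If you want to see what a torn read of 5,000,000,000 and 10,000,000,000 looks like as a number, the arithmetic can be sketched as below. This does not provoke a real torn read; it just splits each value into its high and low 32-bit words and stitches together the high word from one write with the low word from the other. TornReadDemo, Split and Combine are illustrative names, not anything from the framework.

```csharp
using System;

public static class TornReadDemo
{
    // Splits a 64-bit value into the two 32-bit words a 32-bit runtime stores.
    public static (uint High, uint Low) Split(long value)
    {
        ulong bits = unchecked((ulong)value);
        return ((uint)(bits >> 32), (uint)(bits & 0xFFFFFFFF));
    }

    // Reassembles a 64-bit value from a high word and a low word.
    public static long Combine(uint high, uint low)
    {
        return unchecked((long)(((ulong)high << 32) | low));
    }

    public static void Main()
    {
        var a = Split(5_000_000_000L);   // written by thread A
        var b = Split(10_000_000_000L);  // written by thread B

        // A reader that sees A's high word and B's low word gets neither value.
        long torn = Combine(a.High, b.Low);
        Console.WriteLine(torn); // prints 5705032704
    }
}
```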
So how can you use a lock-free pattern and avoid these problems, including torn reads on 32-bit platforms? There are a couple of ways to deal with this, but really only one worth mentioning... Interlocked.
Interlocked.Increment(ref number);
The Interlocked class provides a number of methods, such as Increment, Add, Exchange, CompareExchange and Read, that perform these operations atomically, including on 64-bit values.
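As a rough sketch of how this looks in practice, the example below hammers a shared Int32 counter and a shared Int64 total from several workers using only Interlocked calls. The class, field and method names are hypothetical, and Parallel.For is just one convenient way to get concurrent callers; it is not something the technique requires.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedDemo
{
    private static int counter;
    private static long total;

    // Every worker increments the shared counter atomically, so no update is lost.
    public static int CountWithInterlocked(int workers, int incrementsPerWorker)
    {
        counter = 0;
        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < incrementsPerWorker; i++)
            {
                Interlocked.Increment(ref counter);
            }
        });
        return counter; // Parallel.For has completed, so all increments are visible
    }

    // 64-bit version: Exchange, Add and Read avoid torn reads and writes
    // even when running as a 32-bit process.
    public static long SumWithInterlocked(int workers, long amount)
    {
        Interlocked.Exchange(ref total, 0L);
        Parallel.For(0, workers, _ =>
        {
            Interlocked.Add(ref total, amount);
        });
        return Interlocked.Read(ref total);
    }

    public static void Main()
    {
        Console.WriteLine(CountWithInterlocked(4, 100000));       // prints 400000
        Console.WriteLine(SumWithInterlocked(8, 5_000_000_000L)); // prints 40000000000
    }
}
```

Swap Interlocked.Increment for a plain counter++ and the first number will usually come up short on a multi-core machine, which is a handy way to convince yourself the problem is real.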
So what is the moral of this story? If you are going lock-free... Test, test, test...