CriticalSections

Paul M Watt

We just found a bug related to critical sections. We develop our application mainly on Windows CE, as well as on Windows. Our Windows CE application is pretty solid and is continually updated and tested. The Windows version, however, is updated only once a year. We rebuilt the Windows version and put it through a retest, and it started deadlocking. The main clue each time was that threads were blocking on the critical section of one particular object.

We went through the code and checked every lock and unlock. There were no early exit points from the functions that locked the critical section, and the biggest puzzle was that the same code works on CE.

It turns out that we had an extra call to ::LeaveCriticalSection in one of the functions. This function does not perform any checks in the Windows version and will blindly decrement the lock count. The critical section's lock variable is -1 in the unlocked state and increments each time the owning thread re-enters the critical section. If we leave the critical section one time too many, the lock variable decrements to -2. The next time any thread, including the thread that over-released the critical section, tries to acquire the lock, it will block because the lock variable is not equal to -1.

CE's critical section structure is different. It never decrements further after the owning thread has released the lock, so the error did not show up on CE.

Here is some example code that displays the problem:

CRITICAL_SECTION cs;
::InitializeCriticalSection(&cs);

::EnterCriticalSection(&cs);
::EnterCriticalSection(&cs);
::EnterCriticalSection(&cs);

::LeaveCriticalSection(&cs);
::LeaveCriticalSection(&cs);
::LeaveCriticalSection(&cs);

// This next call is one too many.
::LeaveCriticalSection(&cs);

// This call will cause the thread to block in Windows.
::EnterCriticalSection(&cs);

::DeleteCriticalSection(&cs);
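To make the counter arithmetic described above easier to follow, here is a toy, single-threaded model of it. This is only a sketch built from the description in this post: the ModelCriticalSection type, its fields, and the integer thread ids are made up for illustration and do not reflect the real CRITICAL_SECTION layout or the real wait logic.

```cpp
// Toy model of the lock counter described in the post.
// Assumption: -1 means "unlocked" and Leave never validates the caller.
// This is NOT the real CRITICAL_SECTION structure.
struct ModelCriticalSection {
    long lockCount = -1;      // -1 means unlocked, per the post
    long recursionCount = 0;  // how many times the owner has entered
    int  owner = 0;           // 0 means no owner; thread ids are hypothetical

    // Returns true if `thread` would get the lock, false if it would block.
    bool enter(int thread) {
        if (lockCount == -1) {            // free: take ownership
            lockCount = 0;
            owner = thread;
            recursionCount = 1;
            return true;
        }
        if (owner == thread) {            // re-entry by the owner
            ++lockCount;
            ++recursionCount;
            return true;
        }
        return false;                     // anyone else would wait here
    }

    // Blindly decrements, mirroring the unchecked desktop behaviour described.
    void leave() {
        if (--recursionCount <= 0) {
            owner = 0;
        }
        --lockCount;
    }
};

// Replays the sequence from the post and reports whether the final
// enter would block.
bool FinalEnterBlocks() {
    ModelCriticalSection cs;
    cs.enter(1); cs.enter(1); cs.enter(1);
    cs.leave();  cs.leave();  cs.leave();   // lockCount is back at -1
    cs.leave();                              // one too many: lockCount is -2
    return !cs.enter(1);
}
```

In this model the extra leave() pushes lockCount to -2, so the "is it free?" test against -1 can never succeed again and there is no owner for the re-entry path to match.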

Pete OHanlon

This is why we always put monitoring code around any threading to identify issues (normally where locks are acquired and not released). I hate reference counting - it's just such a hack. GRRR.

the last thing I want to see is some pasty-faced geek with skin so pale that it's almost translucent trying to bump parts with a partner - John Simmons / outlaw programmer
Deja View - the feeling that you've seen this post before.

James R Twine

Yet another reason to wrap access to things like this in a simple class - you can prevent over-releasing a resource, and in the case of C++ exceptions, the resource will get released. Peace!

-=- James

If you think it costs a lot to do it right, just wait until you find out how much it costs to do it wrong!
Avoid driving a vehicle taller than you and remember that Professional Driver on Closed Course does not mean your Dumb Ass on a Public Road!

Paul M Watt

I am using ATL's CComCriticalSection, which does not appear to prevent over-releasing. Time to create a new solution, I suppose.
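One possible shape for such a solution is a lock that tracks its owner and depth and refuses an unlock from a thread that does not hold it. The sketch below is an assumption-laden illustration, not ATL code: CheckedCriticalSection is a hypothetical name, and std::recursive_mutex stands in for CRITICAL_SECTION so the example stays portable.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical checked lock. std::recursive_mutex is a stand-in for
// CRITICAL_SECTION; the class name and methods are illustrative only.
class CheckedCriticalSection {
public:
    void Lock() {
        m_mutex.lock();
        // From here on this thread holds the lock, so m_depth is safe to touch.
        m_owner.store(std::this_thread::get_id());
        ++m_depth;
    }

    // Returns false and does nothing if the caller does not hold the lock,
    // instead of corrupting the lock state.
    bool Unlock() {
        if (m_owner.load() != std::this_thread::get_id()) {
            return false;                 // over-release or wrong thread
        }
        if (--m_depth == 0) {
            m_owner.store(std::thread::id());  // no owner any more
        }
        m_mutex.unlock();
        return true;
    }

private:
    std::recursive_mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id()};
    unsigned m_depth = 0;                 // only touched while holding m_mutex
};
```

A debug build might assert on the false return so the offending call site shows up immediately; whether to fail loudly or silently ignore the extra unlock is a design choice left open here.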

Stephen Hewitt

Use the application verifier. Here's how I'd go about looking for CRITICAL_SECTION problems such as yours:

1. Download the latest version of WinDBG. Every Win32 developer should get to know this debugger and its associated tools.
2. Select "Start"->"All Programs"->"Debugging Tools for Windows"->"Global Flags".
3. Select the "Image File" tab.
4. In the "Image: (TAB to refresh)" edit box, enter the name of your EXE with extension (not the full path, just the filename).
5. Press TAB.
6. Tick the "Enable application verifier" checkbox.
7. Press "Apply".
8. Run the application in the debugger of your choice (I'd use WinDBG with a symbol server). <BANG>
9. When done, un-tick all the options and press "OK".

It is important to note that without this last step the verifier will stay on forever for all EXEs with the name entered in step 4, even when the "Global Flags" application is closed.

When you over-release, a breakpoint will be generated and you'll see something like the following in the debug window:

===========================================================
VERIFIER STOP 00000209: pid 0x12D0: critical section over-released or corrupted

0012FF18 : Critical section address
FFFFFFFF : Lock count
00000000 : Expected minimum lock count
7C97C7C0 : Critical section debug info address

===========================================================

When the breakpoint is hit examine the call stack; it will take you straight to the source of the problem. This is only scratching the surface of the magical powers the Application Verifier possesses. You should run your application through it regularly.

Steve

Tim Smith

If you don't want reference counting, then how do you handle the problem of the same thread locking the same lock twice? In small applications that isn't much of a problem, but in larger applications, being able to lock the same lock twice can reduce program complexity.

Tim Smith I'm going to patent thought. I have yet to see any prior art.

Pete OHanlon

Tim Smith wrote:

If you don't want reference counting, then how do you handle the problem of the same thread locking the same lock twice

I didn't say don't use it. I just said I hated having to do it (I also hate having to put XML Code comments into C# to document it, but I still do it). I've been using reference counting since I started COM programming, and it always had to be handled ever so carefully.


Pete OHanlon

That's interesting. I didn't know that it did that. A very cool feature.


Paul M Watt

Cool thanks, I will check this out.

Andy Brummer

What James is talking about is something like CComAutoCriticalSection. As long as you don't do strange things like calling the destructor explicitly, you won't run into this issue, because management of the resource is tied to the lifetime of the object, which is managed by the compiler. [edit] Just read the docs a little more closely. He is talking about a class like this:

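class lock
{
public:
    // Acquire the critical section for the lifetime of this object.
    explicit lock(CComCriticalSection& section) : m_section(section)
    {
        m_section.Lock();
    }

    // Release it on scope exit, including during exception unwinding.
    ~lock()
    {
        m_section.Unlock();
    }

private:
    CComCriticalSection& m_section;
};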

[/edit]

Using the GridView is like trying to explain to someone else how to move a third person's hands in order to tie your shoelaces for you. -Chris Maunder

NormDroid

That's why C++ was invented.

We made the buttons on the screen look so good you'll want to lick them. Steve Jobs

peterchen

Two classes, actually, since you have to monitor two resources:

CCriticalSection - holds the CRITICAL_SECTION resource, but does NOT expose Enter/Leave
CLockCriticalSection - holds the acquisition of the critical section

(Boost's sync objects got it right. But on a lazy day, I just use scope guards.)
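A minimal sketch of that two-class split might look like the following. The class names come from the post, but everything else is an assumption: std::recursive_mutex stands in for CRITICAL_SECTION to keep the example portable, the friend declaration is just one way to hide Enter/Leave, and IncrementShared is a hypothetical usage helper.

```cpp
#include <mutex>

class CLockCriticalSection;

// Owns the lock resource, but exposes no Enter/Leave to callers.
// std::recursive_mutex is a stand-in for CRITICAL_SECTION here.
class CCriticalSection {
public:
    CCriticalSection() = default;
    CCriticalSection(const CCriticalSection&) = delete;
    CCriticalSection& operator=(const CCriticalSection&) = delete;

private:
    friend class CLockCriticalSection;   // the only code allowed to lock
    std::recursive_mutex m_mutex;
};

// Owns one acquisition of the critical section for its own lifetime.
class CLockCriticalSection {
public:
    explicit CLockCriticalSection(CCriticalSection& cs) : m_cs(cs) {
        m_cs.m_mutex.lock();
    }
    ~CLockCriticalSection() {
        m_cs.m_mutex.unlock();           // exactly one unlock per lock
    }
    CLockCriticalSection(const CLockCriticalSection&) = delete;
    CLockCriticalSection& operator=(const CLockCriticalSection&) = delete;

private:
    CCriticalSection& m_cs;
};

// Hypothetical usage: nested guards on the same thread are fine, and
// both acquisitions are released when the function returns.
int IncrementShared(CCriticalSection& cs, int& value) {
    CLockCriticalSection outer(cs);
    CLockCriticalSection inner(cs);
    return ++value;
}
```

Because callers cannot reach the underlying lock or unlock operations, every release is paired with an acquisition by construction, which rules out the stray extra release that started this thread.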

Developers, Developers, Developers, Developers, Developers, Developers, Velopers, Develprs, Developers!
We are a big screwed up dysfunctional psychotic happy family - some more screwed up, others more happy, but everybody's psychotic joint venture definition of CP

Lost User

I had a play with the application verifier, but now I can't disable it and am getting some VERIFIER STOP messages deep in some MFC code which I really want to ignore. I cleared the Image name from the "Image File" page but this isn't making any difference. Every single checkbox on every page of Global Flags is off. Any ideas?

Lost User

Found it in the registry under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options.