开发者

Portable thread-safe lazy singleton

Greetings to all.

I'm trying to write a thread safe lazy singleton for future use. Here's the best I could come up with. Can anyone spot any problems with it? The key assumption is that static initialization occurs in a single thread before dynamic initialisations. (this will be used for a commercial project and company is not using boost :(, life would be a breeze otherwise :)

PS: Haven't check that this compiles yet, my apologies.

/*

There are two difficulties when implementing the singleton pattern:

Problem (a):  The "global variable instantiation fiasco". TODO: URL
This is due to the unspecified order in which global variables are initialised. Static class members are equivalent
to a global variable in C++ during initialisation.

Problem (b):  Multi-threading.
Care must be taken to ensure that the mutex initialisation is handled properly with respect to problem (a).

*/


/*
Things achieved, maybe:

*) Portable

*) Lazy creation.

*) Safe from unspecified order of global variable initialisation.

*) Thread-safe.

*) Mutex is properly initialise when invoked during global variable intialisation:

*) Effectively lock free in instance().


*/



/************************************************************************************

Platform dependent mutex implementation

*/
class Mutex
{
public:
 void lock();
 void unlock();
};



/************************************************************************************

Threadsafe singleton 

*/
class Singleton
{
public:  // Interface
 static Singleton* Instance();


private:  // Static helper functions

 static Mutex* getMutex();


private:  // Static members

 static Singleton* _pInstance;

 static Mutex* _pMutex;


private:  // Instance members

 bool* _pInstanceCreated;  // This is here to convince myself that the compiler is not re-ordering instructions.


private:  // Singletons can't be coppied

 explicit Singleton();
 ~Singleton() { }
};


/************************************************************************************

We can't use a static class member variable to initialised the mutex due to the unspecified
order of initialisation of global variables.

Calling this from 

*/
Mutex* Singleton::getMutex()
{
 static Mutex* pMutex = 0;  // alternatively:  static Mutex* pMutex = new Mutex();
 if( !pMutex )
 { 
  pMutex = new Mutex();  // Constructor initialises the mutex: eg. pthread_mutex_init( ... )
 }

 return pMutex;
}


/************************************************************************************

This static member variable ensures that we call Singleton::getMutex() at least once before
the main entry p开发者_StackOverflow社区oint of the program so that the mutex is always initialised before any threads
are created.

*/
Mutex* Singleton::_pMutex = Singleton::getMutex();


/************************************************************************************
Keep track of the singleton object for possible deletion.

*/
Singleton* Singleton::_pInstance = Singleton::Instance();


/************************************************************************************
Read the comments in Singleton::Instance().

*/
Singleton::Singleton( bool* pInstanceCreated )
{
 fprintf( stderr, "Constructor\n" );

 _pInstanceCreated = pInstanceCreated; 
}


/************************************************************************************
Read the comments in Singleton::Instance().

*/
void Singleton::setInstanceCreated()
{
 _pInstanceCreated = true;
}


/************************************************************************************

Fingers crossed.

*/
Singleton* Singleton::Instance()
{
 /*

 'instance' is initialised to zero the first time control flows over it. So
 avoids the unspecified order of global variable initialisation problem.

 */ 
 static Singleton* instance = 0;


 /*
 When we do:

  instance = new Singleton( instanceCreated );

 the compiler can reorder instructions and any way it wants as long
 as the observed behaviour is consistent to that of a single threaded environment ( assuming
 that no thread-safe compiler flags are specified). The following is thus not threadsafe:

 if( !instance )
 {
  lock();
  if( !instance )
  {
   instance = new Singleton( instanceCreated );
  }
  lock();
 }

 Instead we use:

  static bool instanceCreated = false;

 as the initialisation indicator.
 */
 static bool instanceCreated = false;


 /*

 Double check pattern with a slight swist.

 */
 if( !instanceCreated )
 {
  getMutex()->lock();
  if( !instanceCreated )
  {
   /*
   The ctor keeps a persistent reference to 'instanceCreated'.

   In order to convince our-selves of the correct order of initialisation (I think
   this is quite unecessary
   */
   instance = new Singleton( instanceCreated );

   /*
   Set the reference to 'instanceCreated' to true.

   Note that since setInstanceCreated() actually uses the non-static
   member variable: '_pInstanceCreated', I can't see the compiler taking the
   liberty to call Singleton's ctor AFTER the following call. (I don't know
   much about compiler optimisation, but I doubt that it will break up the ctor into
   two functions and call one part of it before the following call and the other part after.
   */
   instance->setInstanceCreated();

   /*
   The double check pattern should now work.
   */
  }  
  getMutex()->unlock();
 }

 return instance;
}


No, this will not work. It is broken.

The problem has little/nothing to do with the compiler. It has to do with the order in which a second CPU will 'see' what the first CPU has done to memory. The memory (and caches) will be consistent, but the timing of WHEN each CPU decides to write or read each part of memory/cache is indeterminate.

So for CPU1:

instance = new Singleton( instanceCreated );
instance->setInstanceCreated();

Let's consider the compiler first. There is NO reason why the compiler doesn't reorder or otherwise alter these functions. Maybe like:

temp_register = new Singleton(instanceCreated);
temp_register->setInstanceCreated();
instance = temp_register;

or many other possibilities - like you said as long as single-threaded observed behaviour is consistent. This DOES include things like " break up the ctor into two functions and call one part of it before the following call and the other part after."

Now, it probably wouldn't break it up into 2 calls, but it would INLINE the ctor, particularly since it is so small. Then, once inlined, everything may be reordered, as if the ctor was broken in 2, for example.

In general, I would say not only is it possible that the compiler reordered things, it is probable - ie for the code you have, there is probably a reordering (once inlined, and inlining is likely) that is 'better' than the order given by the C++ code.

But let's leave that aside, and try to understand the real issues of double-checked locking. So, let's just assume the compiler didn't reorder anything. What about the CPU? Or more importantly CPUs - plural.

The first CPU, 'CPU1' needs to follow the instructions given by the compiler, in particular, it needs to write to memory the things it has been told to write:

  • instance,
  • instanceCreated
  • other member variable of the Singleton (ie your Singleton does DO something, and has some state, doesn't it?)

Actually, that 'other member variable' stuff is really important. Important for your singleton - that's its real purpose right?, and important for our discussion. So let's give it a name: important_data. ie instance->important_data. And maybe instance->important_function(), which uses important_data. Etc.

As mentioned, let's assume the compiler has written the code such that these items are written in the order you are expecting, namely:

  1. important_data - written inside the ctor, called from

    instance = new Singleton(instanceCreated);

  2. instance - assigned right after new/ctor returns

  3. instanceCreated - inside setInstanceCreated()

Now, the CPU hands these writes off to the memory bus. Know what the memory bus does? IT REORDERS THEM. The CPU and architecture has the same constraints as the compiler - ie make sure this one CPU sees things consistently - ie single threaded consistent. So if, for example, instance and instanceCreated are on the same cache-line (highly likely, actually), they might be written together, and since they were just read, that cache-line is 'hot', so maybe they get written FIRST before important_data, so that that cache-line can be retired to make room for the cache-line where important_data lives.

Did you see that? instanceCreated and instance were just committed to memory BEFORE important_data. Note that CPU1 doesn't care, because it is living in a single-threaded world...

So now introduce CPU2:

CPU2 comes in, sees instanceCreated == true and instance != NULL and thus goes off and decides to call Singleton::Instance()->important_function(), which uses important_data, which is uninitialized. CRASH BANG BOOM.

By the way, it gets worse. So far, we've seen that the compiler could reorder, but we're pretending it didn't. Let's go one step further and pretend that CPU1 did NOT reorder any of the memory writing. Are we OK now?

No. Of course not.

Just as CPU1 decided to optimize/reorder its memory writes, CPU2 can REORDER ITS READS!

CPU2 comes in and sees

if (!instanceCreated) ...

so it needs to read instanceCreated. Ever heard of 'speculative execution'? (Great name for a FPS game, by the way). If the memory bus isn't busy doing anything, CPU2 might pre-read some other values 'hoping' that instanceCreated is true. ie it may pre-read important_data for example. Maybe important_data (or the uninitialized, possibly re-claimed-by-the-allocator memory that will become important_data) is already in CPU2's cache. Or maybe (more likely?) CPU2 just free'd that memory, and the allocator wrote NULL in its first 4 bytes (allocators often use that memory for their free-lists), so actually, the memory soon-to-become important_data may actually still be in the write queue of CPU2. In that case, why would CPU2 bother re-reading that memory, when it hasn't even finished writing it yet!? (it wouldn't - it would just get the values from its write-queue.)

Did that make sense? If not, imagine that the value of instance (which is a pointer) is 0x17e823d0. What was that memory doing before it became (becomes) the Singleton? Is that memory still in the write-queue of CPU2?...

Or basically, don't even think about why it might want to do so, but realize that CPU2 might read important_data first, then instanceCreated second. So even though CPU1 may have wrote them in order CPU2 sees 'crap' in important_data, then sees true in instanceCreated (and who knows what in instance!). Again, CRASH BANG BOOM. Or BOOM CRASH BANG, since by now you realize that the order isn't guaranteed...


It's usually better to have a non-lazy singleton which does nothing in its constructor, and then in GetInstance do a thread-safe call once to a function which allocates any expensive resources. You're already creating a Mutex non-lazily, so why not just put the mutex and some kind of Pimpl in your Singleton object?

By the way, this is easier on Posix:

struct Singleton {
    static Singleton *GetInstance() {
        pthread_once(&control, doInit);
        return instance;
    }

private:
    static void doInit() {
        // slight problem: we can't throw from here, or fail
        try {
            instance = new Singleton();
        } catch (...) {
            // we could stash an error indicator in a static member,
            // and check it in GetInstance.
            std::abort();
        }
    }

    static pthread_once_t control;
    static Singleton *instance;
};

pthread_once_t Singleton::control = PTHREAD_ONCE_INIT;
Singleton *Singleton::instance = 0;

There do exist pthread_once implementations for Windows and other platforms.


If you wish to see an in-depth discussion of Singletons, the various policies about their lifetime and the thread safety issues, I can only recommend a good read: "Modern C++ Design" by Alexandrescu.

The implementation is presented on the web in Loki, find it here!

And yes, it does hold in a single header file. So I would really encourage you to at least grab the file and read it, and better yet read the book to have the full-blown reflection.


At global scope in your code:

/************************************************************************************
Keep track of the singleton object for possible deletion.

*/
Singleton* Singleton::_pInstance = Singleton::Instance();

This makes your implementation not lazy. Presumably you want to set _pInstance to NULL at global scope, and assign to it after you construct the singleton inside Instance() before you unlock the mutex.


More food for thought from Meyers & Alexandrescu, with Singleton being the specific target: C++ and the Perils of Double-Checked Locking. It's a bit of a prickly problem.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜