What is the most efficient way to make this code thread safe?
Some C++ library I'm working on features a simple tracing mechanism which can be activated to generate log files showing which functions were called and what arguments were passed. It basically boils down to a TRACE
macro being spilled all over the source of the library, and the macro expands to something like this:
typedef void(*TraceProc)( const char *msg );
/* Sets 'callback' to point to the trace procedure which actually prints the given
* message to some output channel, or to a null trace procedure which is a no-op when
* case the given source file/line position was disabled by the client.
*
* This function also registers the callback pointer in an internal data structure
* and resets it to zero in case the filtering configuration changed since the last
* invocation of updateTraceCallback.
*/
void updateTraceCallback( TraceProc *callback, const char *file, unsinged int lineno );
#define TRACE(msg) \
{ \
static TraceProc traceCallback = 0; \
if ( !traceCallback ) \
updateTraceCallback( &traceCallback, __FILE__, __LINE__ ); \
traceCallback( msg ); \
}
The idea is that people can just say TRACE("foo hit")
in their code and that will either
call a debug printing function or it will be a no-op. They can use some other API (which is not shown here) to configure that only TRACE
uses in locations (source file/line number) should be printed. This configuration can change at runtime.
The issue with this is that this idea should now be used in a multi-threaded code base. Hence, the code which TRACE
expands to needs to work correctly in the face of multiple threads of execution running the code simultaneously. There are about 20.000 different trace points in the code base right now and they are hit very often, so they should be rather efficient
What is the most efficient way to make this approach thread safe? I need a solution for Windows (XP and newer) and Linux. I'm afraid of doing excessive locking just to check whether the filter configuration changed (99% of the time a trace point is hit, the configuration didn't change). I'm open to larger changes to the macro, too. So instea开发者_C百科d of discussing mutex vs. critical section performance, it would also be acceptable if the macro just sent an event to an event loop in a different thread (assuming that accessing the event loop is thread safe) and all the processing happens in the same thread, so it's synchronized using the event loop.
UPDATE: I can probably simplify this question to:
If I have one thread reading a pointer, and another thread which might write to the variable (but 99% of the time it doesn't), how can I avoid that the reading thread needs to lock all the time?
You could implement a configuration file version variable. When your program starts it is set to 0. The macro can hold a static int that is the last config version it saw. Then a simple atomic comparison between the last seen and the current config version will tell you if you need to do a full lock and re-call updateTraceCallback();
.
That way, 99% of the time you'll only add an extra atomic op, or memory barrier or something simmilar, which is very cheap. 1% of the time, just do the full mutex
thing, it shouldn't affect your performance in any noticeable way, if its only 1% of the time.
Edit:
Some .h file:
extern long trace_version;
Some .cpp file:
long trace_version = 0;
The macro:
#define TRACE(msg)
{
static long __lastSeenVersion = -1;
static TraceProc traceCallback = 0;
if ( !traceCallback || __lastSeenVersion != trace_version )
updateTraceCallback( &traceCallback, &__lastSeenVersion, __FILE__, __LINE__ );
traceCallback( msg );
}
The functions for incrementing a version and updates:
static long oldVersionRefcount = 0;
static long curVersionRefCount = 0;
void updateTraceCallback( TraceProc *callback, long &version, const char *file, unsinged int lineno ) {
if ( version != trace_version ) {
if ( InterlockedDecrement( oldVersionRefcount ) == 0 ) {
//....free resources.....
//...no mutex needed, since no one is using this,,,
}
//....aquire mutex and do stuff....
InterlockedIncrement( curVersionRefCount );
*version = trace_version;
//...release mutex...
}
}
void setNewTraceCallback( TraceProc *callback ) {
//...aquire mutex...
trace_version++; // No locks, mutexes or anything, this is atomic by itself.
while ( oldVersionRefcount != 0 ) { //..sleep? }
InterlockedExchange( &oldVersionRefcount, curVersionRefCount );
curVersionRefCount = 0;
//.... and so on...
//...release mutex...
Of course, this is very simplified, since if you need to upgrade the version and the oldVersionRefCount > 0
, then you're in trouble; how to solve this is up to you, since it really depends on your problem. My guess is that in those situations, you could simply wait until the ref count is zero, since the amount of time that the ref count is incremented should be the time it takes to run the macro.
I still don't fully understand the question, so please correct me on anything I didn't get.
(I'm leaving out the backslashes.)
#define TRACE(msg)
{
static TraceProc traceCallback = NULL;
TraceProc localTraceCallback;
localTraceCallback = traceCallback;
if (!localTraceCallback)
{
updateTraceBallback(&localTraceCallback, __FILE__, __LINE__);
// If two threads are running this at the same time
// one of them will update traceCallback and get it overwritten
// by the other. This isn't a big deal.
traceCallback = localTraceCallback;
}
// Now there's no way localTraceCallback can be null.
// An issue here is if in the middle of this executing
// traceCallback gets null'ed. But you haven't specified any
// restrictions about this either, so I'm assuming it isn't a problem.
localTraceCallback(msg);
}
Your comment says "resets it to zero in case the filtering configuration changes at runtime" but am I correct in reading that as "resets it to zero when the filtering configuration changes"?
Without knowing exactly how updateTraceCallback implements its data structure, or what other data it's referring to in order to decide when to reset the callbacks (or indeed to set them in the first place), it's impossible to judge what would be safe. A similar problem applies to knowing what traceCallback does - if it accesses a shared output destination, for example.
Given these limitations the only safe recommendation that doesn't require reworking other code is to stick a mutex around the whole lot (or preferably a critical section on Windows).
I'm afraid of doing excessive locking just to check whether the filter configuration changed (99% of the time a trace point is hit, the configuration didn't change). I'm open to larger changes to the macro, too. So instead of discussing mutex vs. critical section performance, it would also be acceptable if the macro just sent an event to an event loop in a different thread (assuming that accessing the event loop is thread safe)
How do you think thread safe messaging between threads is implemented without locks?
Anyway, here's a design that might work:
The data structure that holds the filter must be changed so that it is allocated dynamically from the heap because we are going to be creating multiple instances of filters. Also, it's going to need a reference count added to it. You need a typedef something like:
typedef struct Filter
{
unsigned int refCount;
// all the other filter data
} Filter;
There's a singleton 'current filter' declared somewhere.
static Filter* currentFilter;
and initialised with some default settings.
In your TRACE macro:
#define TRACE(char* msg)
{
static Filter* filter = NULL;
static TraceProc traceCallback = NULL;
if (filterOutOfDate(filter))
{
getNewCallback(__FILE__, __LINE__, &traceCallback, &filter);
}
traceCallback(msg);
}
filterOutOfDate()
merely compares the filter with currentFilter to see if it is the same. It should be enough to just compare addresses. It does no locking.
getNewCallback()
applies the current filter to get the new trace function and updates the filter passed in with the address of the current filter. It's implementation must be protected with a mutex lock. Also, it decremetns the refCount of the original filter and increments the refCount of the new filter. This is so we know when we can free the old filter.
void getNewCallback(const char* file, int line, TraceProc* newCallback, Filter** filter)
{
// MUTEX lock
newCallback = // whatever you need to do
currentFilter->refCount++;
if (*filter != NULL)
{
*filter->refCount--;
if (*filter->refCount == 0)
{
// free filter and associated resources
}
}
*filter = currentFilter;
// MUTEX unlock
}
When you want to change the filter, you do something like
changeFilter()
{
Filter* newFilter = // build a new filter
newFilter->refCount = 0;
// MUTEX lock (same mutex as above)
currentFilter = newFilter;
// MUTEX unlock
}
If I have one thread reading a pointer, and another thread which might write to the variable (but 99% of the time it doesn't), how can I avoid that the reading thread needs to lock all the time?
From your code, it is OK to use the mutex inside the updateTraceCallback()
since it is going to be called very rarely (once per location). After taking the mutex, check whether the traceCallback
is already initialized: if yes, then other thread just did it for you and there is nothing to be done.
If updateTraceCallback()
would turn out to be a serious performance problem due to the collisions on the global mutex, then you can simply make an array of mutexes instead and use hashed value of the traceCallback
pointer as an index into the mutex array. That would spread locking over many mutexes and minimize number of collisions.
#define TRACE(msg) \
{ \
static TraceProc traceCallback = \
updateTraceBallback( &traceCallback, __FILE__, __LINE__ ); \
traceCallback( msg ); \
}
精彩评论