pthread_cond_wait not waking up from pthread_cond_broadcast
In my program, one part of the code waits to be woken up by another part.
Here's the part that goes to sleep:
void flush2device(int task_id) {
    if (pthread_mutex_lock(&id2cvLock) != SUCCESS) {
        cerr << "system error - exiting!!!\n";
        exit(1);
    }
    map<int,pthread_cond_t*>::iterator it;
    it = id2cv.find(task_id);
    if(it == id2cv.end()){
        if (pthread_mutex_unlock(&id2cvLock) != SUCCESS) {
            cerr << "system error\n UNLOCKING MUTEX flush2device\n";
            exit(1);
        }
        return;
    }
    cout << "Waiting for CV signal" <<endl;
    if(pthread_cond_wait(it->second, &id2cvLock)!=SUCCESS){
        cerr << "system error\n COND_WAIT flush2device - exiting!!!\n";
        exit(1);
    }
    cout << "should be right after " << task_id << " signal" << endl;
    if (pthread_mutex_unlock(&id2cvLock) != SUCCESS) {
        cerr << "system error\n UNLOCKING MUTEX flush2device -exiting!!!\n";
        exit(1);
    }
}
In another part of the code there's the waking-up (signaling) part:
// id2cv is a map<int, pthread_cond_t*> variable; the value is a pointer to the cv on
// which we call the broadcast method.
if(pthread_mutex_lock(&id2cvLock)!=SUCCESS){
    cerr <<"system error\n";
    exit(1);
}
id2cv.erase(nextBuf->_taskID);
cout << "In Thread b4 signal, i'm tID " <<nextBuf->_taskID << endl;
if (pthread_cond_broadcast(nextBuf->cv) != 0) {
    cerr << "system error SIGNAL_CV doThreads\n";
    exit(1);
}
cout << "In doThread, after erasing id2cv " << endl;
if(pthread_mutex_unlock(&id2cvLock)!=SUCCESS){
    cerr <<"system error\n";
    exit(1);
}
Most runs work just fine, but once in a while the program stops responding: the first function (above) never gets past the cond_wait call. It seems that nobody sends it the signal in time (or it fails for some other reason), while the other function (which the last piece of code belongs to) keeps running.
Where am I going wrong in the logic of mutexes and signaling? I've already checked that the pthread_cond_t variable is still "alive" before the calls to cond_wait and cond_broadcast, and nothing in that area seems to be at fault.
Despite its name, pthread_cond_wait is an unconditional wait for a condition. You must not call pthread_cond_wait unless you have confirmed that there is something to wait for, and the thing it's waiting for must be protected by the associated mutex.
Condition variables are stateless and it is the application's responsibility to store the state of the thing being waited for, called a 'predicate'.
The canonical pattern is:
pthread_mutex_lock(&mutex);
while(!ready_for_me_to_do_something)
    pthread_cond_wait(&condvar, &mutex);
do_stuff();
ready_for_me_to_do_something=false; // this may or may not be appropriate
pthread_mutex_unlock(&mutex);
and:
pthread_mutex_lock(&mutex);
ready_for_me_to_do_something=true;
pthread_cond_broadcast(&condvar);
pthread_mutex_unlock(&mutex);
Notice how this code maintains the state in the ready_for_me_to_do_something variable and the waiting thread waits in a loop until that variable is true. Notice how the mutex protects that shared variable, and it protects the condition variable (because that is also shared between the threads).
This is not the only correct way to use a condition variable, but it is very easy to run into trouble with any other use. You call pthread_cond_wait even if there is no reason to wait. If you wait for your sister to get home with the car before you use it, and she has already returned, you will be waiting a long time.
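To make the "sister already came home" case concrete, here is a minimal sketch of the pattern above that deliberately broadcasts before the waiter even starts. The names Gate, gate_wait, gate_open and broadcast_before_wait_completes are made up for illustration and are not from the question, and most error checking is left out for brevity.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical helper type: a condition variable bundled with its predicate.
struct Gate {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool ready;                       // the predicate, guarded by mutex

    Gate() : ready(false) {
        pthread_mutex_init(&mutex, nullptr);
        pthread_cond_init(&cond, nullptr);
    }
    ~Gate() {
        pthread_cond_destroy(&cond);
        pthread_mutex_destroy(&mutex);
    }
};

// Blocks until ready is true; returns at once if it is already true.
void gate_wait(Gate& g) {
    pthread_mutex_lock(&g.mutex);
    while (!g.ready)                  // re-checked after every wakeup
        pthread_cond_wait(&g.cond, &g.mutex);
    pthread_mutex_unlock(&g.mutex);
}

// Records the state change first, then wakes any current waiters.
void gate_open(Gate& g) {
    pthread_mutex_lock(&g.mutex);
    g.ready = true;
    pthread_cond_broadcast(&g.cond);
    pthread_mutex_unlock(&g.mutex);
}

static void* waiter_thread(void* arg) {
    gate_wait(*static_cast<Gate*>(arg));
    return nullptr;
}

// Opens the gate BEFORE the waiter exists. Without the ready flag the
// waiter would block forever; with it, the join returns normally.
bool broadcast_before_wait_completes() {
    Gate g;
    gate_open(g);
    pthread_t t;
    if (pthread_create(&t, nullptr, waiter_thread, &g) != 0)
        return false;
    return pthread_join(t, nullptr) == 0;
}
```

Calling broadcast_before_wait_completes() from main should return true instead of hanging, because the waiter reads the flag under the mutex and never needs to see the broadcast itself.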
Your use of pthread_cond_wait() is not correct. If a condition variable is signalled while no threads are waiting, the signal has no effect. It's not saved for the next time a thread waits. This means that correct use of pthread_cond_wait() looks like:
pthread_mutex_lock(&mutex);
/* ... */
while (!should_wake_up)
    pthread_cond_wait(&cond, &mutex);
The should_wake_up condition might just be a simple test of a flag variable, or it might be something like a more complicated test for a buffer being empty or full, or something similar. The mutex must be locked to protect against concurrent modifications that might change the result of should_wake_up.
It is not clear what that test should be in your program - you might need to add a specific flag variable.
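One way such a flag could look in a per-task layout like yours is sketched below. This is only a guess at a workable shape, not your actual code: TaskSync, id2sync, id2syncLock, register_task, wait_for_flush and mark_flushed are all hypothetical names, each id is assumed to be registered once, and error checks and cleanup of the records are omitted.

```cpp
#include <pthread.h>
#include <map>
#include <cassert>

// Hypothetical per-task record: the cv together with the flag it stands for.
struct TaskSync {
    pthread_cond_t cv;
    bool flushed;                     // predicate, guarded by id2syncLock
};

static pthread_mutex_t id2syncLock = PTHREAD_MUTEX_INITIALIZER;
static std::map<int, TaskSync*> id2sync;

// Assumption: called once per task id before anyone waits on it.
void register_task(int id) {
    TaskSync* s = new TaskSync;
    pthread_cond_init(&s->cv, nullptr);
    s->flushed = false;
    pthread_mutex_lock(&id2syncLock);
    id2sync[id] = s;
    pthread_mutex_unlock(&id2syncLock);
}

// Returns false if the task is unknown, true once it has been flushed.
bool wait_for_flush(int id) {
    pthread_mutex_lock(&id2syncLock);
    std::map<int, TaskSync*>::iterator it = id2sync.find(id);
    if (it == id2sync.end()) {
        pthread_mutex_unlock(&id2syncLock);
        return false;
    }
    TaskSync* s = it->second;
    while (!s->flushed)               // wait only while there is a reason to
        pthread_cond_wait(&s->cv, &id2syncLock);
    pthread_mutex_unlock(&id2syncLock);
    return true;
}

// Sets the flag, then broadcasts; the record stays in the map so that a
// late waiter still finds the flag already set.
void mark_flushed(int id) {
    pthread_mutex_lock(&id2syncLock);
    std::map<int, TaskSync*>::iterator it = id2sync.find(id);
    if (it != id2sync.end()) {
        it->second->flushed = true;
        pthread_cond_broadcast(&it->second->cv);
    }
    pthread_mutex_unlock(&id2syncLock);
}
```

With this layout, calling mark_flushed(7) before wait_for_flush(7) no longer hangs: the waiter sees flushed == true and skips the wait. The records have to be destroyed at some later point when no thread can still be waiting on them, which the sketch leaves out.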
I don't think there's enough code in the "waking up" part, but my initial guess is that the pthread_cond_wait hasn't been entered at the time pthread_cond_broadcast is issued.
Another possibility is that pthread_cond_wait is in the middle of a spurious wakeup and misses the signal completely.
I'm pretty sure that most uses of condition variables also have an external predicate that must be checked after every wakeup to see if there is work to be done.
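As an illustration of an external predicate that is more than a flag, here is a small sketch of a work queue where the condition is "the queue is not empty". The names queueLock, queueNotEmpty, pending, push_work and pop_work are invented for this example, and return codes are ignored for brevity.

```cpp
#include <pthread.h>
#include <queue>
#include <cassert>

// Hypothetical work queue; the predicate is "pending is not empty".
static pthread_mutex_t queueLock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queueNotEmpty = PTHREAD_COND_INITIALIZER;
static std::queue<int> pending;       // guarded by queueLock

// Producer: change the shared state first, then signal.
void push_work(int item) {
    pthread_mutex_lock(&queueLock);
    pending.push(item);
    pthread_cond_signal(&queueNotEmpty);
    pthread_mutex_unlock(&queueLock);
}

// Consumer: the loop re-tests the predicate after every wakeup, so a
// spurious wakeup, or another consumer taking the item first, simply
// sends this thread back to waiting.
int pop_work() {
    pthread_mutex_lock(&queueLock);
    while (pending.empty())
        pthread_cond_wait(&queueNotEmpty, &queueLock);
    int item = pending.front();
    pending.pop();
    pthread_mutex_unlock(&queueLock);
    return item;
}
```

Because the work itself lives in the queue and not in the signal, it does not matter whether push_work runs before or after the consumer reaches pop_work.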