pthreads program works for some time and then stalls
There is a program I am working on that, after I launch it, works for some time and then stalls. Here is a simplified version of the program:
#include <cstdlib>
#include <iostream>
#include <pthread.h>
pthread_t* thread_handles;
pthread_mutex_t mutex;
pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
int thread_count;
const int some_count = 77;
const int numb_count = 5;
int countR = 0;
//Initialize threads
void InitTh(char* arg[]){
/* Get number of threads */
thread_count = strtol(arg[1], NULL, 10);
/*Allocate space for threads*/
thread_handles =(pthread_t*) malloc (thread_count*sizeof(pthread_t));
}
//Terminate threads
void TermTh(){
for(long thread = 0; thread < thread_count; thread++)
pthread_join(thread_handles[thread], NULL);
free(thread_handles);
}
void* DO_WORK(void* replica) {
/*Does something*/
pthread_mutex_lock(&mutex);
countR++;
if (countR == numb_count) pthread_cond_broadcast(&cond_var);
pthread_mutex_unlock(&mutex);
}
//Some function
void FUNCTION(){
pthread_mutex_init(&mutex, NULL);
for(int k = 0; k < some_count; k++){
for(int j = 0; j < numb_count; j++){
long thread = (long) j % thread_count;
pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);;
}
/*Wait for threads to finish their jobs*/
pthread_mutex_lock(&mutex);
if (countR < numb_count) while(pthread_cond_wait(&cond_var,&mutex) != 0);
countR = 0;
pthread_mutex_unlock(&mutex);
/*Does more work*/
}
pthread_cond_destroy(&cond_var);
pthread_mutex_destroy(&mutex);
}
int main(int argc, char* argv[]) {
/*Initialize threads*/
InitTh(argv);
/*Do some work*/
FUNCTION();
/*Terminate threads*/
TermTh();
return 0;
}
When some_count is less than 76 (in my particular case), the program works fine, but if I specify a larger value, the program, as mentioned earlier, works for some time and then stalls. Can somebody point out what I am doing wrong?
In
long thread = (long) j % thread_count;
pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);
you can overwrite thread handles that are already in use, depending on the thread count you pass on the command line.
I think you should size the handle array by numb_count rather than by the value taken from argv, and then replace
long thread = (long) j % thread_count;
with
long thread = (long) j;
I'm not sure this alone fixes the stall, but it is needed anyway.
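To make the overwriting concrete, here is a small sketch that counts how many handles are overwritten in a single round when numb_count threads are stored at index j % thread_count. The function name lost_handles is made up for illustration and is not part of the question's code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: counts how many thread handles get overwritten in one
// round when numb_count threads are stored at index j % thread_count.
// An overwritten handle can no longer be passed to pthread_join.
int lost_handles(int numb_count, int thread_count) {
    assert(thread_count > 0);  // j % 0 would be undefined
    std::vector<bool> used(thread_count, false);
    int lost = 0;
    for (int j = 0; j < numb_count; j++) {
        int slot = j % thread_count;
        if (used[slot]) lost++;  // the earlier handle in this slot is gone
        used[slot] = true;
    }
    return lost;
}
```

With numb_count = 5 and a thread count of 2, the slots used are 0, 1, 0, 1, 0, so three of the five handles are overwritten. With a thread count of 5 or more, none are overwritten within a round, although each later round of the outer loop still reuses the same slots.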
Moreover, it's not about the number 76 or 77: there is a race condition in how the threads are used. Suppose one of your threads has reached the point in DO_WORK where it unlocks the mutex but has not yet returned from the function (meaning the thread is still running). In the next iteration you may then create a new thread into the same handle slot with:
pthread_create(&thread_handles[thread], NULL, DO_WORK, (void *)j);
To fix it, change:
pthread_mutex_lock(&mutex);
if (countR < numb_count) while(pthread_cond_wait(&cond_var,&mutex) != 0);
countR = 0;
pthread_mutex_unlock(&mutex);
to:
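pthread_mutex_lock(&mutex);
/* Re-check the predicate in a loop: pthread_cond_wait may wake up spuriously */
while (countR < numb_count) pthread_cond_wait(&cond_var, &mutex);
countR = 0;
pthread_mutex_unlock(&mutex);
/* Join every worker of this round before its handle slot is reused */
for(long thread = 0; thread < numb_count; thread++)
pthread_join(thread_handles[thread], NULL);
With this in place, the pthread_join loop in TermTh has to go, because joining a thread that has already been joined is undefined behavior.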
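The whole round structure with a join at the end of every round could be sketched roughly as follows. The names g_mutex, g_cond, g_done, g_workers, worker and run_rounds are invented for this sketch. It also checks the return value of pthread_create, which the question's code does not do. One plausible reason the stall shows up only after many rounds is that joinable threads that are never joined keep their resources, so pthread_create can eventually fail and the counter never reaches its target. That is an assumption about the cause, not something confirmed in the question.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// All names below are hypothetical and chosen for this sketch only.
static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t g_cond = PTHREAD_COND_INITIALIZER;
static int g_done = 0;     // workers finished in the current round
static int g_workers = 0;  // workers expected per round

void* worker(void*) {
    pthread_mutex_lock(&g_mutex);
    g_done++;
    if (g_done == g_workers) pthread_cond_broadcast(&g_cond);
    pthread_mutex_unlock(&g_mutex);
    return nullptr;  // a void* thread function must return a value
}

// Runs `rounds` rounds of `workers` threads each and returns the total number
// of workers that finished, or -1 if a thread could not be created.
long run_rounds(int rounds, int workers) {
    std::vector<pthread_t> handles(workers);  // one slot per worker
    long finished = 0;
    g_workers = workers;
    for (int k = 0; k < rounds; k++) {
        int created = 0;
        for (int j = 0; j < workers; j++) {
            if (pthread_create(&handles[j], nullptr, worker, nullptr) != 0) break;
            created++;
        }
        if (created == workers) {
            pthread_mutex_lock(&g_mutex);
            // Loop on the predicate to tolerate spurious wakeups.
            while (g_done < workers) pthread_cond_wait(&g_cond, &g_mutex);
            assert(g_done == workers);
            finished += g_done;
            g_done = 0;
            pthread_mutex_unlock(&g_mutex);
        }
        // Join this round's threads so their slots can be reused safely.
        for (int j = 0; j < created; j++) pthread_join(handles[j], nullptr);
        if (created < workers) {
            g_done = 0;  // no workers are running any more
            return -1;
        }
    }
    return finished;
}
```

Calling run_rounds(77, 5) from main would correspond to the question's some_count and numb_count. Because every thread is joined inside the round, no separate join loop is needed at shutdown.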
You could try to analyze it using helgrind.
Install valgrind, then run valgrind --tool=helgrind yourproject and see what helgrind reports.
You are neither initializing your mutex correctly (not causing the error here), nor storing the threads you create correctly. Try this:
for(int count = 0; count < thread_count; ++count) {
pthread_create(&thread_handles[count], NULL, DO_WORK, (void *)(count % numb_count));
}
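The loop above passes the index to the thread by casting an int to void *. A slightly more careful way to do that round trip, sketched here with the made-up names g_seen, record_index and spawn_indexed, goes through intptr_t so the integer and the pointer have the same width. It also stops creating threads as soon as pthread_create reports an error.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>
#include <vector>

// Hypothetical result table: element i is set to 1 by the thread given index i.
static std::vector<int> g_seen;

void* record_index(void* arg) {
    // Recover the integer that was packed into the pointer argument.
    std::intptr_t idx = reinterpret_cast<std::intptr_t>(arg);
    g_seen[idx] = 1;  // each thread writes only its own element
    return nullptr;
}

// Creates thread_count threads, one handle slot each, joins them all,
// and returns how many distinct indices were recorded.
int spawn_indexed(int thread_count) {
    assert(thread_count >= 0);
    g_seen.assign(thread_count, 0);
    std::vector<pthread_t> handles(thread_count);
    int created = 0;
    for (int count = 0; count < thread_count; ++count) {
        void* arg = reinterpret_cast<void*>(static_cast<std::intptr_t>(count));
        if (pthread_create(&handles[count], nullptr, record_index, arg) != 0) break;
        ++created;
    }
    for (int i = 0; i < created; ++i) pthread_join(handles[i], nullptr);
    int seen = 0;
    for (int v : g_seen) seen += v;
    return seen;
}
```

If every thread is created successfully, spawn_indexed(n) returns n. A smaller result means pthread_create failed part of the way through.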