How do I make a multi-threaded app use all the cores on Ubuntu under VMWare?
I have got a multi-threaded app that process a very large data file. Works great on Window 7, the code is all C++, uses the pthreads library for cross-platform multi-threading. When I run it under Windows on my Intel i3 - Task manager shows all four cores pegged to the limit, which is what I want. Compiled the same code using g++ Ubuntu/VMWare workstation - same number of threads are launched, but all threads are running on one core (as far as I can tell - Task Manager only shows one core busy).
I'm going to dive into the pThreads calls - perhaps I missed some default setting - but if anybody has any idea, I'd like to hear them, and I can give more info -
Update: I did setup VMWare to see all four cores and /proc/cpuinfo shows 4 cores
Update 2 - just wrote a simple app to show the problem - maybe it's VMWare only? - any Linux natives out there want 开发者_开发技巧to try and see if this actually loads down multiple cores? To run this on Windows you will need the pThread library - easily downloadable. And if anyone can suggest something more cpu intensive than printf- go ahead!
#ifdef _WIN32
#include "stdafx.h"
#endif
#include "stdio.h"
#include "stdlib.h"
#include "pthread.h"
void *Process(void *data)
{
long id = (long)data;
for (int i=0;i<100000;i++)
{
printf("Process %ld says Hello World\n",id);
}
return NULL;
}
#ifdef _WIN32
int _tmain(int argc, _TCHAR* argv[])
#else
int main(int argc, char* argv[])
#endif
{
int numCores = 1;
if (argc>1)
numCores = strtol(&argv[1][2],NULL,10);
pthread_t *thread_ids = (pthread_t *)malloc(numCores*sizeof(pthread_t));
for (int i=0;i<numCores;i++)
{
pthread_create(&thread_ids[i],NULL,Process,(void *)i);
}
for (int i=0;i<numCores;i++)
{
pthread_join(thread_ids[i],NULL);
}
return 0;
}
I changed your code a bit. I changed numCores = strtol(&argv[1][2], NULL, 10);
to numCores = strtol(&argv[1][0], NULL, 10);
to make it work under Linux by calling ./core 4
maybe you where passing something in front of the number of cores, or because type _TCHAR is 3byte per char? Not that familiar with windows.. Further more since I wasn't able to stress the CPU with only printf I also changed Process a bit.
void *Process(void *data)
{
long hdata = (long)data;
long id = (long)data;
for (int i=0;i<10000000;i++)
{
printf("Process %ld says Hello World\n",id);
for (int j = 0; j < 100000; j++)
{
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
hdata *= j;
...
}
}
return (void*)hdata;
}
And now when I run gcc -O2 -lpthread -std=gnu99 core.c -o core && ./core 4
You can see that all 4 threads are running on 4 different cores well probably the are swapped from core to core at a time but all 4 cores are working overtime.
core.c: In function ‘main’:
core.c:75:50: warning: cast to pointer from integer of different size [-Wint-to-pointer- cast]
Starting 4 threads
Process 0 says Hello World
Process 1 says Hello World
Process 3 says Hello World
Process 2 says Hello World
I verified it with htop
hope it helps.. :) I run dedicated Debian SID x86_64 with 4cores in case you're wondering.
精彩评论