Matrix multiplication using Win32 threads
I have no idea what's wrong with my code. It always returns zeros in all the elements. A hint about where the problem is would be great :)
#include <iostream>
#include <stdio.h>
#include <cstdlib>
#include <ctime>
#include <windows.h>
using namespace std;
int nGlobalCount = 0;
int thread_index = 0;
int num_of_thr=5;
int a[4][4], b[4][4], c[4][4];
int i, j, k;
struct v {
    int i; /*row*/
    int j; /*column*/
};
DWORD ThreadProc (LPVOID lpdwThreadParam ) {
    //
    struct v *input = (struct v *)lpdwThreadParam;
    int avg=4*4/num_of_thr;
    int count=0;
    for(int i = 0; i <= 3 ; i++) {
        for(int j = 0; j <= 3; j++) {
            int sum=0;
            for ( k = 0 ; k <= 3; k++) {
                sum=sum+((a[input->i][k])*(b[k][input->j]));
                c[input->i][input->j]=sum;
                count++;
            }
        }
    }
    //Print Thread Number
    //printf ("Thread #: %d\n", *((int*)lpdwThreadParam));
    //Reduce the count
    return 0;
}
int main() {
    // int x=0;
    cout<<"enter no of threads : ";
    cin>>num_of_thr;
    DWORD ThreadIds[num_of_thr];
    HANDLE ThreadHandles[num_of_thr];
    //struct v {
    // int i; /*row*/
    // int j; /*column*/
    //};
    struct v data[num_of_thr];
    int i , j , k;
    for ( int i = 0 ; i <= 3; i++) {
        for (int j = 0 ; j <= 3 ; j++) {
            a[i][j] = rand() % 10;
            b[i][j] = rand() % 10;
            c[i][j] = 0;
        }
    }
    for(int i = 0; i < num_of_thr/2; i++) {
        for(int j = 0; j < num_of_thr/2; j++) {
            data[thread_index].i = i;
            data[thread_index].j = j;
            ThreadHandles[thread_index] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)&ThreadProc, &data[thread_index], 0,&ThreadIds[thread_index]);
            thread_index++;
        }
    }
    WaitForMultipleObjects(num_of_thr, ThreadHandles, TRUE, INFINITE);
    cout<<"The resultant matrix is "<<endl;
    for ( i = 0 ; i < 4; i++) {
        for ( j = 0 ; j < 4 ; j++)
            cout<<c[i][j]<<" ";
        cout<<endl;
    }
    for (int i=0; i<num_of_thr; i++)
        CloseHandle(ThreadHandles[i]);
    return 0;
}
At a GLANCE, your sum declaration in the loop looks sketchy.
for(int i = 0; i <= 3 ; i++) {
    for(int j = 0; j <= 3; j++) {
        for ( k = 0 ; k <= 3; k++)
        {
            int sum=sum+((a[input->i][k])*(b[k][input->j])); // this declaration seems wrong
            c[input->i][input->j]=sum;
            count++;
        }
    }
}
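On every iteration of the inner loop you redeclare sum, so the running total from the previous iteration is thrown away. Worse, the sum on the right-hand side of that declaration refers to the new, still uninitialized variable, so the value it starts from is undefined rather than reliably 0. Move the declaration (initialized to 0) up one or two loops from the assignment, depending on what you are trying to achieve.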
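As a minimal sketch of that suggestion, the accumulator can be declared once per output cell, before the k loop starts. The helper name cell_product and the constant N are made up for illustration and do not appear in the original code.

```cpp
#include <cassert>

const int N = 4;  // matrix dimension, matching the 4x4 arrays in the question

// Hypothetical helper: computes one element (row, col) of the product a*b.
int cell_product(const int a[N][N], const int b[N][N], int row, int col) {
    assert(row >= 0 && row < N && col >= 0 && col < N);
    int sum = 0;  // declared and zeroed once per cell, before the k loop
    for (int k = 0; k < N; k++) {
        sum += a[row][k] * b[k][col];  // accumulates across all values of k
    }
    return sum;
}
```

With the declaration outside the k loop, the result only needs to be stored into c once, after the loop has finished.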
Do you realise that you have two separate sets of variables named a, b and c? One is local to the function main, and the other is a static for the whole program. I suspect that this is not what you intended. Try deleting the one that is local to main.
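A small sketch of why that kind of duplication can produce all zeros, assuming the situation described above: a local array that has the same name as a file-scope one hides it, so code elsewhere fills the global while the caller reads its own untouched local copy. The function names here are hypothetical.

```cpp
#include <cassert>

int c[2] = {0, 0};  // file-scope array, visible to every function

// Hypothetical worker: writes its results to the file-scope c.
void fill_global() {
    c[0] = 7;
    c[1] = 9;
}

// Shows what a caller sees when it declares its own local c.
int read_shadowed() {
    int c[2] = {0, 0};  // local array with the same name hides the global one
    fill_global();      // updates ::c, not the local c
    return c[0];        // reads the local copy, which is still 0
}

// Reads the file-scope array explicitly through the scope operator.
int read_global() {
    fill_global();
    return ::c[0];
}
```

Here read_shadowed() yields 0 even though the worker ran, while read_global() yields 7.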
Martyn
A few things I found while poking about, in addition to the other issues noted previously:
- What are you compiling this with? With VC++ 2010 it "works", as in it outputs non-zeroes, although it complains about the DWORD ThreadIds[num_of_thr]; array declaration with a non-constant array size (I just made num_of_thr a constant and commented out the cin to test it quickly).
- Are you sure you are entering a valid number of threads at cin >> num_of_thr; ? For example, if num_of_thr was 0 this would explain the zeroes output. A simple cout here for num_of_thr would be useful.
- In your data initialization loop starting with for(int i = 0; i < num_of_thr/2; i++) { you are not correctly counting threads, which will result in an array underflow or overflow. For example, if num_of_thr is 5 then num_of_thr/2 is 2, which results in initializing only the elements 0..3, leaving the last element uninitialized. An array underflow is technically OK, although the later CloseHandle() call will fail when it tries to free an essentially random handle. If you enter a larger number of threads you will overflow all your arrays (try it with num_of_thr=10 for example).
- If it still doesn't work, try removing the threading to see whether the threading or the code itself is the source of the problem. For example, you can call the ThreadProc() function manually in a loop instead of from within threads. Either trace through the program with a debugger or output logs to stdout/file (which would also work in the threading model).
- Instead of a random source matrix I would use a few fixed values at first with a known result. This will make it easier to determine if the code is actually computing the correct result.
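The last three points can be sketched together. The code below is one possible approach, not the original poster's design: it splits the 16 output cells into one contiguous range per worker, so the number of work items always equals the requested worker count, and it runs the workers in a plain loop with no threads so the arithmetic can be checked against fixed inputs. CellRange, range_for, multiply_range and multiply_sequential are hypothetical names.

```cpp
#include <cassert>

const int N = 4;  // matrix dimension, as in the question

// Hypothetical work item: cells [first, last) in row-major order.
struct CellRange {
    int first;
    int last;
};

// Splits the N*N cells among `workers` so each cell is covered exactly once.
CellRange range_for(int worker, int workers) {
    assert(workers > 0 && worker >= 0 && worker < workers);
    const int total = N * N;
    CellRange r;
    r.first = worker * total / workers;
    r.last = (worker + 1) * total / workers;
    return r;
}

// Does what a thread body would do, but is callable directly for debugging.
void multiply_range(const int a[N][N], const int b[N][N], int c[N][N], CellRange r) {
    for (int cell = r.first; cell < r.last; cell++) {
        const int row = cell / N;
        const int col = cell % N;
        int sum = 0;
        for (int k = 0; k < N; k++) {
            sum += a[row][k] * b[k][col];
        }
        c[row][col] = sum;
    }
}

// Runs every worker's share one after another, without any threads.
void multiply_sequential(const int a[N][N], const int b[N][N], int c[N][N], int workers) {
    for (int w = 0; w < workers; w++) {
        multiply_range(a, b, c, range_for(w, workers));
    }
}
```

Multiplying a fixed matrix by the identity matrix should give the fixed matrix back for any positive worker count, which makes a convenient first check. Once the sequential version is right, each CellRange could be handed to a thread instead, with exactly `workers` handles to wait on and close.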