efficient indexing of an array
Suppose that I have
z[7]={0, 0, 2, 0, 1, 2, 1}
that means- first observation allocated in group 0, 开发者_运维问答second obs group 0, third group 2 etc and I want to write an efficient code to get an array of 3X? such that in the first row I have all the observations allocated in the first group, second row all the obs allocated in the second group etc.. something like
0, 1, 3
4, 6
2, 5
and this must be general, maybe I could have
z={0, 0, 2, 0, 1, 2, 1, 3, 4, 2, 0, 4, 5, 5, 6, 7, 0}
so the number of columns is unknown
I did do my homework and the code is attached to this message, but there must be a better way to do it. I believe that with pointers but I really do not know how.
#include <stdio.h>
int main(){
int z[7]={0, 0, 2, 0, 1, 2, 1}, nj[3], iz[3][7], ip[3], i, j;
for(j=0; j<3; j++){
ip[j] = 0;
nj[j] = 0;
}
for(i=0; i <7; i++ ){
nj[z[i]] = nj[z[i]] + 1;
iz[z[i]][ip[z[i]]] = i;
ip[z[i]] = ip[z[i]] + 1;
}
for(j=0; j<3 ;j++){
for(i=0; i < nj[j]; i++){
printf("%d\t", iz[j][i]);
}
printf("\n");
}
return 0;
}
It seems that you have two tasks here.
- To count the number of occurrences of each index in
z
and allocate a data structure of the right size and configuration. - To iterate over the data and copy it to the correct places.
At the moment you appear to have solved (1) naively by allocating a big, two dimensional array iz
. That works fine if you know in advance the limits on how big it could be (and your machine will have enough memory), no need to fix this until later.
It is not clear to me exactly how (2) should be approached. Is the data currently going into iz
guaranteed to consist of [0, 1, ... n ]?
If you don't know the limits of the size of iz
in advance, then you will have to allocate a dynamic structure. I'd suggest a ragged array though this means two (or even three) passes over z
.
What do I mean by a ragged array? A object like the argv
argument to main
, but in this case of type int **
. In memory it looks like this:
+----+ +---+ +---+---+---+--
| iz |---->| |---->| | | ...
+----+ +---+ +---+---+---+--
| |--
+---+ \ +---+---+---+--
| . | --->| | | ...
. +---+---+---+--
.
A ragged array can be accessed with iz[][]
just like it was a two-dimensional array (but it is a different type of object), which is nice for your purposes because you can tune your algorithm with the code you have now, and then slap one of these in place.
How to set it up.
Iterate of
z
to find the largest number,maxZ
, present.Allocate an array of
int*
of sizemaxZ+1
:iz=callac(maxZ+1,sizeof(int*));
.I chose
calloc
because it zeros the memory, which makes all those pointersNULL
, but you could usemalloc
and NULL them yourself. Making the array one too big gives us a NULL termination, which may be useful later.Allocate an array of counters of size
maxZ
:int *cz = calloc(maxZ,sizeof(int));
iterate over
z
, fillingcz
with the number of entries needed in each row.For each row, allocate an array of ints:
for(i=0; i<maxZ; ++i){ iz[i] = malloc(sizeof(int)*cz[i]; }
Iterate over
z
one last time, sticking the figures intoiz
as you already do. You could re-usecz
at this point to keep track of how many figure have already been put into each row, but you might want to allocate a separate array for that purpose because so that you have a record of how big each allocated array was.
NB: Every call to malloc
or calloc
ought to be accompanied by a check to insure that the allocation worked. I've left that as an exercise for the student.
This repeated passes over z
business can be avoided entirely by using dynamic arrays, but I suspect you don't need that and don't want the added complexity.
精彩评论