C / C++ How to copy a multidimensional char array without nested loops?

I'm looking for a smart way to copy a multidimensional char array to a new destination. I want to duplicate the char array because I want to edit the content without changing the source array.

I could build nested loops to copy every char by hand but I hope there is a better way.

Update:

I don't have the size of the second-level dimension. Only the length (number of rows) is given.

The code looks like this:

char **tmp;
char **realDest;

int length = someFunctionThatFillsTmp(&tmp);

//now I want to copy tmp to realDest

I'm looking for a method that copies all the memory of tmp into free memory and points realDest to it.

Update 2:

someFunctionThatFillsTmp() is the function credis_lrange() from the Redis C lib credis.c.

Inside the lib tmp is created with:

rhnd->reply.multibulk.bulks = malloc(sizeof(char *)*CR_MULTIBULK_SIZE)

Update 3:

I've tried to use memcpy with these lines:

int cb = sizeof(char) * size * 8; //string inside 2. level has 8 chars
memcpy(realDest,tmp,cb);
cout << realDest[0] << endl;

prints: mystring

But I'm getting a: Program received signal: EXC_BAD_ACCESS


You could use memcpy.

If the array dimensions are known at compile time, e.g. mytype myarray[1][2], then a single memcpy call is enough:

memcpy(dest, src, sizeof (mytype) * rows * columns);

If, as you indicated, the array is dynamically allocated, you need to know the size of both dimensions. The rows of a dynamically allocated array are generally not stored contiguously, so memcpy has to be called once per row.

Given a 2d array, the method to copy it would be as follows:

char** src;
char** dest;

int length = someFunctionThatFillsTmp(&src);
dest = malloc(length*sizeof(char*));

for ( int i = 0; i < length; ++i ){
    //width must be known (see below)
    dest[i] = malloc(width);

    memcpy(dest[i], src[i], width);
}

Since your question suggests you are dealing with an array of strings, you can use strlen to find the length of each string (each one must be null-terminated).

In that case the loop becomes:

for ( int i = 0; i < length; ++i ){
    int width = strlen(src[i]) + 1;
    dest[i] = malloc(width);    
    memcpy(dest[i], src[i], width);
}
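
Putting the two loops above together, here is a minimal sketch of a complete deep copy with error checking. The names copy_strings and free_strings are made up for illustration, and it assumes every src[i] is a valid null-terminated string:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Frees `count` heap-allocated strings and then the pointer table itself. */
void free_strings(char **strings, size_t count)
{
    if (strings == NULL)
        return;
    for (size_t i = 0; i < count; ++i)
        free(strings[i]);
    free(strings);
}

/* Returns a deep copy of src[0..count-1], or NULL if an allocation fails.
   Assumption: every src[i] is a valid null-terminated string. */
char **copy_strings(char *const *src, size_t count)
{
    assert(src != NULL);

    /* Ask for at least one element so that malloc(0) is never called. */
    char **dest = malloc((count ? count : 1) * sizeof *dest);
    if (dest == NULL)
        return NULL;

    for (size_t i = 0; i < count; ++i) {
        size_t width = strlen(src[i]) + 1;   /* +1 for the terminator */
        dest[i] = malloc(width);
        if (dest[i] == NULL) {
            free_strings(dest, i);           /* release the rows copied so far */
            return NULL;
        }
        memcpy(dest[i], src[i], width);
    }
    return dest;
}
```

The caller owns the result and should release it with free_strings(dest, count) when done.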


When you have a pointer to a pointer in C, you have to know how the data is going to be used and laid out in the memory. Now, the first point is obvious, and true for any variable in general: if you don't know how some variable is going to be used in a program, why have it? :-). The second point is more interesting.

At the most basic level, a pointer to type T points to one object of type T. For example:

int i = 42;
int *pi = &i;

Now, pi points to one int. If you wish, you can make a pointer point to the first of many such objects:

int arr[10];
int *pa = arr;
int *pb = malloc(10 * sizeof *pb);

pa now points to the first of a sequence of 10 (contiguous) int values, and assuming that malloc() succeeds, pb points to the first of another set of 10 (again, contiguous) ints.

The same applies if you have a pointer to a pointer:

int **ppa = malloc(10 * sizeof *ppa);

Assuming that malloc() succeeds, now you have ppa pointing to the first of a sequence of 10 contiguous int * values.

So, when you do:

char **tmp = malloc(sizeof(char *)*CR_MULTIBULK_SIZE);

tmp points to the first char * object in a sequence of CR_MULTIBULK_SIZE such objects. Each of the pointers above is not initialized, so tmp[0] to tmp[CR_MULTIBULK_SIZE-1] all contain garbage. One way to initialize them would be to malloc() them:

size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i)
    tmp[i] = malloc(...);

The ... above is the size of the ith data we want. It could be a constant, or it could be a variable, depending upon i, or the phase of the moon, or a random number, or anything else. The main point to note is that you have CR_MULTIBULK_SIZE calls to malloc() in the loop, and that while each malloc() is going to return you a contiguous block of memory, the contiguity is not guaranteed across malloc() calls. In other words, the second malloc() call is not guaranteed to return a pointer that starts right where the previous malloc()'s data ended.

To make things more concrete, let's assume CR_MULTIBULK_SIZE is 3. In pictures, your data might look like this:

     +------+                                          +---+---+
tmp: |      |--------+                          +----->| a | 0 |
     +------+        |                          |      +---+---+
                     |                          |
                     |                          |
                     |         +------+------+------+
                     +-------->|  0   |  1   |  2   |
                               +------+------+------+
                                   |      |
                                   |      |    +---+---+---+---+---+
                                   |      +--->| t | e | s | t | 0 |
                            +------+           +---+---+---+---+---+
                            |
                            |
                            |    +---+---+---+
                            +--->| h | i | 0 |
                                 +---+---+---+

tmp points to a contiguous block of 3 char * values. The first of the pointers, tmp[0], points to a contiguous block of 3 char values. Similarly, tmp[1] and tmp[2] point to 5 and 2 chars respectively. But the memory pointed to by tmp[0] to tmp[2] is not contiguous as a whole.

Since memcpy() copies contiguous memory, what you want to do can't be done by one memcpy(). Further, you need to know how each tmp[i] was allocated. So, in general, what you want to do needs a loop:

char **realDest = malloc(CR_MULTIBULK_SIZE * sizeof *realDest);
/* assume malloc succeeded */
size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i) {
    realDest[i] = malloc(size * sizeof *realDest[i]);
    /* again, no error checking */
    memcpy(realDest[i], tmp[i], size);
}

As above, you can call memcpy() inside the loop, so you don't need a nested loop in your code. (Most likely memcpy() is implemented with a loop, so the effect is as if you had nested loops.)

Now, if you had code like:

char *s = malloc(size * CR_MULTIBULK_SIZE * sizeof *s);
size_t i;
for (i=0; i < CR_MULTIBULK_SIZE; ++i)
    tmp[i] = s + i*size;    /* each row is size chars wide */

i.e., you allocated contiguous space for all the data in one malloc() call, then you can copy all the data with a single memcpy():

size_t i;
char **realDest = malloc(CR_MULTIBULK_SIZE * sizeof *realDest);
*realDest = malloc(size * CR_MULTIBULK_SIZE * sizeof **realDest);
memcpy(*realDest, tmp[0], size*CR_MULTIBULK_SIZE);

/* Now set realDest[1]...realDest[CR_MULTIBULK_SIZE-1] to "proper" values */
for (i=1; i < CR_MULTIBULK_SIZE; ++i)
    realDest[i] = realDest[0] + i * size;

From the above, the simple answer is, if you had more than one malloc() to allocate memory for tmp[i], then you will need a loop to copy all the data.
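
For the single-allocation case, a self-contained sketch might look like the following. The helper names alloc_rows, copy_rows and free_rows are hypothetical, and it assumes every row has the same fixed width `size`, which is not what credis gives you:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates `rows` rows of `size` chars in ONE contiguous, zeroed block and
   returns a pointer table whose entries point into that block. */
char **alloc_rows(size_t rows, size_t size)
{
    assert(rows > 0 && size > 0);

    char **table = malloc(rows * sizeof *table);
    char *block = calloc(rows, size);
    if (table == NULL || block == NULL) {
        free(table);
        free(block);
        return NULL;
    }
    for (size_t i = 0; i < rows; ++i)
        table[i] = block + i * size;   /* the stride is the row width */
    return table;
}

/* Copies a table built by alloc_rows(): one memcpy moves all the data. */
char **copy_rows(char *const *src, size_t rows, size_t size)
{
    assert(src != NULL);

    char **dest = alloc_rows(rows, size);
    if (dest != NULL)
        memcpy(dest[0], src[0], rows * size);
    return dest;
}

/* Releases a table built by alloc_rows(). */
void free_rows(char **table)
{
    if (table != NULL) {
        free(table[0]);   /* the data block */
        free(table);      /* the pointer table */
    }
}
```

The pointer table still has to be rebuilt for the copy, because the pointers in the source table refer to the source block.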


You can just calculate the overall size of the array and then use memcpy to copy it.

int cb = sizeof(char) * rows * columns;
memcpy (toArray, fromArray, cb);

Edit: new information in the question indicates that the number of rows and cols of the array is not known, and that the array may be ragged, so memcpy may not be a solution.
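
For the case where the single memcpy does apply, i.e. a genuine 2D array whose dimensions are known at compile time, a minimal sketch could be this. ROWS, COLS and copy_grid are illustrative names, not anything from the question:

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 3, COLS = 8 };

/* Copies a genuine 2D array. All ROWS*COLS chars sit in one contiguous
   block, so a single memcpy covers every row. */
void copy_grid(char dest[ROWS][COLS], char src[ROWS][COLS])
{
    assert(dest != NULL && src != NULL);
    memcpy(dest, src, sizeof(char[ROWS][COLS]));
}
```

This only works because the rows are stored back to back. It does not carry over to a char ** filled by separate malloc() calls.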


Let's explore some possibilities for what's going on here:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv){
  char **tmp1;         // Could point any where
  char **tmp2 = NULL;
  char **tmp3 = NULL;
  char **tmp4 = NULL;
  char **tmp5 = NULL;
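  int size = SIZE_MACRO; // Well, you never said
  int cb = sizeof(char) * size * 8; //string inside 2. level has 8 chars

  char **realDest = malloc(cb); // The destination needs real storage of at least
                                // cb bytes, or every memcpy below is undefined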

  /* Case 1: did nothing with tmp1 */
  memcpy(realDest,tmp1,cb);  // copies 8*size bytes from WHEREVER tmp1 happens to be
                             // pointing. This is undefined behavior and might crash.
  printf("%p\n",tmp1[0]);    // Reads a pointer from WHEREVER tmp1 points, undefined
                             // behavior, might crash.
  printf("%c\n",tmp1[0][0]); // Reads WHEREVER tmp1 points, undefined behavior,
                             // might crash. IF it hasn't crashed yet, dereferences THAT
                             // memory location, ALSO undefined behavior and
                             // might crash


  /* Case 2: NULL pointer */
  memcpy(realDest,tmp2,cb);  // Dereferences a NULL pointer. Crashes with SIGSEGV
  printf("%p\n",tmp2[0]);    // Dereferences a NULL pointer. Crashes with SIGSEGV
  printf("%c\n",tmp2[0][0]); // Dereferences a NULL pointer. Crashes with SIGSEGV


  /* Case 3: Small allocation at the other end */
  tmp3 = calloc(sizeof(char*),1); // Allocates space for ONE char*'s 
                                  // (4 bytes on most 32 bit machines), and 
                                  // initializes it to 0 (NULL on most machines)
  memcpy(realDest,tmp3,cb);  // Accesses at least 8 bytes of the 4 byte block: 
                             // undefined behavior, might crash
  printf("%p\n",tmp3[0]);    // FINALLY one that works. 
                             // Prints a representation of a 0 pointer   
  printf("%c\n",tmp3[0][0]); // Dereferences a 0 (i.e. NULL) pointer. 
                             // Crashes with SIGSEGV


  /* Case 4: Adequate allocation at the other end */
  tmp4 = calloc(sizeof(char*),32); // Allocates space for 32 char*'s 
                                  // (4*32 bytes on most 32 bit machines), and 
                                  // initializes it to 0 (NULL on most machines)
  memcpy(realDest,tmp4,cb);  // Accesses at least 8 bytes of large block. Works.
  printf("%p\n",tmp4[0]);    // Works again. 
                             // Prints a representation of a 0 pointer   
  printf("%c\n",tmp4[0][0]); // Dereferences a 0 (i.e. NULL) pointer. 
                             // Crashes with SIGSEGV


  /* Case 5: Full ragged array */
  tmp5 = calloc(sizeof(char*),8); // Allocates space for 8 char*'s
  for (int i=0; i<8; ++i){
    tmp5[i] = calloc(sizeof(char),2*(i+1)); // Allocates space for 2(i+1) characters
    tmp5[i][0] = '0' + i;               // Assigns the first character a digit for ID
  }
  // At this point we have finally allocated 8 strings of sizes ranging 
  // from 2 to 16 characters.
  memcpy(realDest,tmp5,cb);  // Accesses at least 8 bytes of large block. Works.
                             // BUT what works means is that 2*size elements of 
                             // realDest now contain pointers to the character 
                             // arrays allocated in the for block above.
                             //
                             // There are still only 8 strings allocated
  printf("%p\n",tmp5[0]);    // Works again. 
                             // Prints a representation of a non-zero pointer   
  printf("%c\n",tmp5[0][0]); // This is the first time this has worked. Prints "0\n"
  tmp5[0][0] = '*';
  printf("%c\n",realDest[0][0]); // Prints "*\n", because realDest[0] == tmp5[0],
                                 // So the change to tmp5[0][0] affects realDest[0][0]

  return 0;
}

The moral of the story is: you must know what is on the other side of your pointers. Or else.

The second moral of the story is: just because you can access a double pointer using the [][] notation does not make it the same as a two-dimensional array. Really.


Let me clarify the second moral a little bit.

An array (be it one dimensional, two dimensional, whatever) is an allocated piece of memory, and the compiler knows how big it is (but never does any range checking for you) and at what address it starts. You declare arrays with

char string1[32];
unsigned int histo2[10][20];

and similar declarations.

A pointer is a variable that can hold a memory address. You declare pointers with

char *string_ptr1;
double *matrix_ptr = NULL;

They are two different things.

But:

  1. If you use the [] syntax with a pointer, the compiler will do pointer arithmetic for you.
  2. In almost any place you use an array without dereferencing it, the compiler treats it as a pointer to the array's start location.

So, I can do

    strcpy(string1,"dmckee");

because rule 2 says that string1 (an array) is treated as a char*. Likewise, I can follow that with:

    char *string_ptr2 = string1;

Finally,

    if (string_ptr2[3] == 'k') {
      printf("OK\n");
    }

will print "OK" because of rule 1.
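
To make the array/pointer distinction concrete, here is a small sketch that checks both rules with assert. The function name array_pointer_demo is made up; the variable names follow the snippets above:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the two rules above behave as described for a small example. */
int array_pointer_demo(void)
{
    char string1[32];
    char *string_ptr2 = string1;   /* rule 2: the array decays to a char * */

    strcpy(string1, "dmckee");

    /* An array knows its own size; a pointer only knows the size of a pointer. */
    assert(sizeof string1 == 32);
    assert(sizeof string_ptr2 == sizeof(char *));

    /* Rule 1: p[i] is just pointer arithmetic, *(p + i). */
    assert(string_ptr2[3] == *(string_ptr2 + 3));

    return string_ptr2[3] == 'k';
}
```

The sizeof difference is why a single memcpy sized from a char ** can never see the strings behind it: the pointer carries no information about how much memory it points to.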


Why are you not using C++?

#include <cstring>
#include <string>
#include <vector>

class C
{
    std::vector<std::string> data;
public:
    char** cpy();
};

// Returns a newly allocated array of newly allocated C strings.
// The caller must delete[] each element and then the array itself.
char** C::cpy()
{
    char **ppsz = new char* [data.size()];
    for(size_t i = 0; i < data.size(); ++i)
    {
        ppsz[i] = new char [data[i].length() + 1];
        std::strcpy(ppsz[i], data[i].c_str()); // copy the characters, not the pointer
    }
    return(ppsz);
}

Or something similar? Also, do you need to use C-strings? I doubt it.


Note that in the following example:

char **a;

a[i] is char*. So if you do a memcpy() of a, you're doing a shallow copy of that pointer.
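
A short sketch of what that shallow copy means in practice, assuming an array of writable strings (shallow_copy is an illustrative name):

```c
#include <assert.h>
#include <string.h>

/* Copies only the pointer table. Afterwards dest[i] and src[i] refer to the
   very same characters, so the strings themselves are not duplicated. */
void shallow_copy(char **dest, char *const *src, size_t count)
{
    assert(dest != NULL && src != NULL);
    memcpy(dest, src, count * sizeof *src);
}
```

Editing a string through dest after this call also changes what src sees, which is exactly what the question wants to avoid.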

I would ditch the multi-dimensional aspect and go with a flat buffer of size n*n. You can simulate A[i][j] with A[i + j*width]. Then you can memcpy(newBuffer, oldBuffer, width * height * sizeof(*newBuffer)).
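
A minimal sketch of the flat-buffer idea, assuming you control how the data is stored (flat_index and copy_flat are hypothetical helper names):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Maps column i of row j to an offset in a row-major flat buffer. */
size_t flat_index(size_t i, size_t j, size_t width)
{
    return i + j * width;
}

/* Duplicates a width*height flat char buffer with a single memcpy.
   Returns NULL if the allocation fails. */
char *copy_flat(const char *old_buffer, size_t width, size_t height)
{
    assert(old_buffer != NULL);

    size_t bytes = width * height * sizeof *old_buffer;
    char *new_buffer = malloc(bytes ? bytes : 1);
    if (new_buffer != NULL)
        memcpy(new_buffer, old_buffer, bytes);
    return new_buffer;
}
```

This does not help with the char ** that credis hands back, but it keeps copying trivial for data you allocate yourself.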


As others suggested, it looks like this is an array of pointers rather than a multidimensional array.

So instead of being

char mdArray[10][10];

it is:

char* pArray[10];

If that is the case, the only thing you can do is loop over the rows using the one length value you have. If the elements are meant to be strings (which it looks like they are), use strlen, in which case it would be:

char **tmp;

int length = getlengthfromwhereever;

char** copy = new char*[length];

for(int i=0; i<length; i++)
{
    int slen = strlen(tmp[i]);
    copy[i] = new char[slen+1]; //+1 for null terminator
    memcpy(copy[i],tmp[i],slen);
    copy[i][slen] = 0; // you could just copy slen+1 to copy the null terminator, but there might not be one...
}