开发者

Memcpy, string and terminator

I have to write a function that fills a char* buffer for an assigned length with the content of a string. If the string is too long, I just have to cut it. The buffer is not allocated by me but by the user of my function. I tried something like this:

int writebuff(char* buffer, int length){
    string text="123456789012345";
    memcpy(buffer, text.c_str(),length);
    //buffer[length]='\0';
    return 1;
}


int main(){
    char* buffer = new char[10];
    writebuff(buffer,10);
    cout << "After: "<<buffer<<endl;
}

my question is about the terminator: should it be there or not? This function is used in a much wider code and sometimes it seems I get problems with strange characters when th开发者_JAVA技巧e string needs to be cut.

Any hints on the correct procedure to follow?


A C-style string must be terminated with a zero character '\0'.

In addition you have another problem with your code - it may try to copy from beyond the end of your source string. This is classic undefined behavior. It may look like it works, until the one time that the string is allocated at the end of a heap memory block and the copy goes off into a protected area of memory and fails spectacularly. You should copy only until the minimum of the length of the buffer or the length of the string.

P.S. For completeness here's a good version of your function. Thanks to Naveen for pointing out the off-by-one error in your terminating null. I've taken the liberty of using your return value to indicate the length of the returned string, or the number of characters required if the length passed in was <= 0.

int writebuff(char* buffer, int length)
{
    string text="123456789012345";
    if (length <= 0)
        return text.size();
    if (text.size() < length)
    {
        memcpy(buffer, text.c_str(), text.size()+1);
        return text.size();
    }
    memcpy(buffer, text.c_str(), length-1);
    buffer[length-1]='\0';
    return length-1;
}


If you want to treat the buffer as a string you should NULL terminate it. For this you need to copy length-1 characters using memcpy and set the length-1 character as \0.


it seems you are using C++ - given that, the simplest approach is (assuming that NUL termination is required by the interface spec)

int writebuff(char* buffer, int length)
{
  string text = "123456789012345";
  std::fill_n(buffer, length, 0); // reset the entire buffer
  // use the built-in copy method from std::string, it will decide what's best.
  text.copy(buffer, length);
  // only over-write the last character if source is greater than length
  if (length < text.size())
    buffer[length-1] = 0;
  return 1; // eh?
}


char * Buffers must be null terminated unless you are explicitly passing out the length with it everywhere and saying so that the buffer is not null terminated.


Whether or not you should terminate the string with a \0 depends on the specification of your writebuff function. If what you have in buffer should be a valid C-style string after calling your function, you should terminate it with a \0.

Note, though, that c_str() will terminate with a \0 for you, so you could use text.size() + 1 as the size of the source string. Also note that if length is larger than the size of the string, you will copy further than what text provides with your current code (you can use min(length - 2, text.size() + 1/*trailing \0*/) to prevent that, and set buffer[length - 1] = 0 to cap it off).

The buffer allocated in main is leaked, btw


my question is about the terminator: should it be there or not?

Yes. It should be there. Otherwise how would you later know where the string ends? And how would cout would know? It would keep printing garbage till it encounters a garbage whose value happens to be \0. Your program might even crash.

As a sidenote, your program is leaking memory. It doesn't free the memory it allocates. But since you're exiting from the main(), it doesn't matter much; after all once the program ends, all the memory would go back to the OS, whether you deallocate it or not. But its good practice in general, if you don't forget deallocating memory (or any other resource ) yourself.


I agree with Necrolis that strncpy is the way to go, but it will not get the null terminator if the string is too long. You had the right idea in putting an explicit terminator, but as written your code puts it one past the end. (This is in C, since you seemed to be doing more C than C++?)

int writebuff(char* buffer, int length){
    char* text="123456789012345";
    strncpy(buffer, text, length);
    buffer[length-1]='\0';
   return 1;
}


It should most defiantly be there*, this prevents strings that are too long for the buffer from filling it completely and causing an overflow later on when its accessed. though imo, strncpy should be used instead of memcpy, but you'll still have to null terminate it. (also your example leaks memory).

*if you're ever in doubt, go the safest route!


First, I don't know whether writerbuff should terminate the string or not. That is a design question, to be answered by the person who decided that writebuff should exist at all.

Second, taking your specific example as a whole, there are two problems. One is that you pass an unterminated string to operator<<(ostream, char*). Second is the commented-out line writes beyond the end of the indicated buffer. Both of these invoke undefined behavior.

(Third is a design flaw -- can you know that length is always less than the length of text?)

Try this:

int writebuff(char* buffer, int length){
  string text="123456789012345";
  memcpy(buffer, text.c_str(),length);
  buffer[length-1]='\0';
  return 1;
}


int main(){
  char* buffer = new char[10];
  writebuff(buffer,10);
  cout << "After: "<<buffer<<endl;
}


  1. In main(), you should delete the buffer you allocated with new., or allocate it statically (char buf[10]). Yes, it's only 10 bytes, and yes, it's a memory "pool," not a leak, since it's a one-time allocations, and yes, you need that memory around for the entire running time of the program. But it's still a good habit to be into.

  2. In C/C++ the general contract with character buffers is that they be null-terminiated, so I would include it unless I had been explicitly told not to do it. And if I did, I would comment it, and maybe even use a typedef or name on the char * parameter indicating that the result is a string that is not null terminated.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜