Implementing a string copy function in C
At a recent job interview, I was asked to implement my own string copy function. I managed to write code that I believe works to an extent. However, when I returned home to try the problem again, I realized that it was a lot more complex than I had thought. Here is the code I came up with:
#include <stdio.h>
#include <stdlib.h>
char * mycpy(char * d, char * s);
int main() {
int i;
char buffer[1];
mycpy(buffer, "hello world\n");
printf("%s", buffer);
return 0;
}
char * mycpy (char * destination, char * source) {
if (!destination || !source) return NULL;
char * tmp = destination;
while (*destination != NULL || *so开发者_高级运维urce != NULL) {
*destination = *source;
destination++;
source++;
}
return tmp;
}
I looked at some other examples online and found that since all strings in C are null-terminated, I should have read up to the null character and then appended a null character to the destination string before exiting.
However one thing I'm curious about is how memory is being handled. I noticed if I used the strcpy() library function, I could copy a string of 10 characters into a char array of size 1. How is this possible? Is the strcpy() function somehow allocating more memory to the destination?
Good interview question has several layers, to which to candidate can demonstrate different levels of understanding.
On the syntactic 'C language' layer, the following code is from the classic Kernighan and Ritchie book ('The C programming language'):
while( *dest++ = *src++ )
;
In an interview, you could indeed point out the function isn't safe, most notably the buffer on *dest
isn't large enough. Also, there may be overlap, i.e. if dest
points to the middle of the src
buffer, you'll have endless loop (which will eventually creates memory access fault).
As the other answers have said, you're overwriting the buffer, so for the sake of your test change it to:
char buffer[ 12 ];
For the job interview they were perhaps hoping for:
char *mycpy( char *s, char *t )
{
while ( *s++ = *t++ )
{
;
}
return s;
}
No, it's that strcpy()
isn't safe and is overwriting the memory after it, I think. You're supposed to use strncpy()
instead.
No, you're writing past the buffer and overwriting (in this case) the rest of your stack past buffer. This is very dangerous behavior.
In general, you should always create methods that supply limits. In most C libraries, these methods are denoted by an n
in the method name.
C does not do any run time bounds checking like other languages(C#,Java etc). That is why you can write things past the end of the array. However, you won't be able to access that string in some cases because you might be encroaching upon memory that doesn't belong to you giving you a segementation fault. K&R would be a good book to learn such concepts.
The strcpy()
function forgoes memory management entirely, therefore all allocation needs to be done before the function is called, and freed afterward when necessary. If your source string has more characters than the destination buffer, strcpy()
will just keep writing past the end of the buffer into unallocated space, or into space that's allocated for something else.
This can be very bad.
strncpy()
works similarly to strcpy()
, except that it allows you to pass an additional variable describing the size of the buffer, so the function will stop copying when it reaches this limit. This is safer, but still relies on the calling program to allocate and describe the buffer properly -- it can still go past the end of the buffer if you provide the wrong length, leading to the same problems.
char * mycpy (char * destination, char * source) {
if (!destination || !source) return NULL;
char * tmp = destination;
while (*destination != NULL || *source != NULL) {
*destination = *source;
destination++;
source++;
}
return tmp;
}
In the above copy implementation, your tmp and destination are having the same data. Its better your dont retrun any data, and instead let the destination be your out parameter. Can you rewrite the same.
The version below works for me. I'm not sure if it is bad design though:
while(source[i] != '\0' && (i<= (MAXLINE-1)))
{
dest[i]=source[i];
++i;
}
In general it's always a good idea to have const
modifier where it's possible, for example for the source parameter.
精彩评论