Quick strlen question

I've come to bother you all with another probably really simple C question.

Using the following code:

#include <stdio.h>
#include <string.h>

int get_len(char *string){

    printf("len: %zu\n", strlen(string)); // %zu is the format for size_t

    return 0;
}

int main(){

    char *x = "test";
    char y[4] = {'t','e','s','t'};

    get_len(x); // len: 4
    get_len(y); // len: 6

    return 0;
}

Two questions: why are they different, and why is y 6? Thanks guys.

EDIT: Sorry, I know what would fix it, I kind of just wanted to understand what was going on. So does strlen just keep advancing the pointer until it happens to find a \0? Also, when I called strlen in the main function instead of in the get_len function, both were 4. Was that just a coincidence?


y is not null-terminated. strlen() counts characters until it hits a null character. Yours happened to find one after 6 characters, but it could be any number, or the program could crash. Try this:

char y[] = {'t','e','s','t', '\0'};

Here's what an implementation of strlen() might look like (off the top of my head -- don't have my K&R book handy, but I believe there's an implementation given there):

size_t strlen(const char* s)
{
    size_t result = 0;
    while (*s++) ++result;
    return result;
}


This

char y[4] = {'t','e','s','t'};

is not a proper zero-terminated string. It's an array of four characters, without the terminating '\0'. strlen() simply counts the characters until it hits a zero. With y it simply counts over the end of the array until it accidentally finds a zero byte.
Doing this you are invoking undefined behavior. The code might just as well format your hard drive.

You can avoid this by using the special syntax for character array initialization:

char y[] = "test";

This initializes y with five characters, since it automatically appends a '\0'.
Note that I also left the array's size unspecified. The compiler figures this out itself, and it automatically re-figures if I change the string's length.

BTW, here's a simple strlen() implementation:

size_t strlen(const char* p)
{
    size_t result = 0;
    while(*p++) ++result;
    return result;
}

Modern implementations will likely not fetch one byte at a time, and may even use CPU intrinsics, but this is the basic algorithm.


The following is not a null terminated array of characters:

 char y[4] = {'t','e','s','t'};

Part of strlen()'s contract is that it be provided with a pointer to a null terminated string. Since that doesn't happen with strlen(y), you get undefined behavior. In your particular case, you get 6 returned, but anything could happen, including a program crash.

From C99's 7.1.1 "Definitions of terms":

A string is a contiguous sequence of characters terminated by and including the first null character.


strlen works with strings. A string is defined as a sequence (array) of characters terminated by a \0 character.

Your x points to a string. So, strlen works fine with x as an argument.

Your y is not a string. For this reason, passing y to strlen results in undefined behavior. The result is meaningless and unpredictable.


You need to null-terminate y.

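#include <stdio.h>
#include <string.h>

int get_len(char *string){

    printf("len: %zu\n", strlen(string)); // %zu is the format for size_t

    return 0;
}

int main(){

    char *x = "test";
    char y[5] = {'t','e','s','t','\0'}; // fifth element is the terminator

    get_len(x); // len: 4
    get_len(y); // len: 4

    return 0;
}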

strlen() basically takes the pointer you give it and counts the number of bytes until the next null byte ('\0') in memory. (That is not the same thing as NULL, which is a null pointer.) It just so happened that there was a null byte two bytes past the end of your array.


An actual C string takes up one more byte than its number of characters, since it needs a terminating null character.

Therefore, char y[4] = {'t','e','s','t'}; doesn't form a string, since it's four characters. char y[] = "test"; or char y[5] = "test"; would form a string, since they'd have a character array of five characters ending with the null-byte terminator.


char y[5] = {'t','e','s','t','\0'};

would hold the same characters, terminator included, as the string literal that x points to in

char *x = "test"; 


As others have said, you just need to make sure to end a string with the 0 or '\0' character. As a side note, you may want to check this out: http://bstring.sourceforge.net/ . It has an O(1) string length function, unlike the C/C++ strlen, which is easy to misuse and runs in O(N), where N is the number of non-null characters. I don't remember the last time I used strlen and its friends. Go for safe and fast functions/classes!


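When you initialize a char array from individual characters in single quotes, always add a '\0' at the end yourself. With a double-quoted string literal you don't need to, because the compiler appends the '\0' for you.

Note that the strlen() function doesn't count the null character \0 when calculating the length.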
