
C: strange string phenomenon

Could someone explain this phenomenon?

#include "stdio.h"
#include "stdlib.h"

int main()
{
    char foo[]="foo";
    char bar[3]="bar";
    printf("%s",foo);
    printf("\n");
    printf("%s",bar);
    return 0;
}

Result:

foo
barfoo

If I change the order and create bar before foo, I get a correct output.

#include "stdio.h"
#include "stdlib.h"

int main()
{
    char bar[3]="bar";
    char foo[]="foo";
    printf("%s",foo);
    printf("\n");
    printf("%s",bar);
    return 0;
}

Result:

foo
bar

And one more.

#include "stdio.h"
#include "stdlib.h"

int main()
{

    char foobar[]="foobar";
    char FOO[3]={'F','O','O','\0'};
    char BAR[3]="BAR";
    printf("%s",foobar);
    printf("\n");
    printf("%s",FOO);
    printf("\n");
    printf("%s",BAR);
    return 0;
}

Result:

foobar
FOOfoobar
BARFOOfoobar


The string "bar" is four characters long: {'b', 'a', 'r', '\0'}. If you explicitly specify the array length then you need to allocate at least four characters:

char bar[4]="bar";

When you do this:

char bar[3]="bar";
printf("%s",bar);

You are invoking undefined behavior as the bar variable has no null terminator. Anything could happen. In this case specifically, the compiler has laid out the two arrays contiguously in memory:

'b' 'a' 'r' 'f' 'o' 'o' '\0'
 ^           ^
bar[3]      foo[4]

When you print bar it keeps reading until it finds a null terminator, any null terminator. Since bar has none it keeps going until it finds the one at the end of "foo\0".


If you declare char bar[3]="bar";, then you will declare a char array with no room for the null terminator. So printf() will just carry on reading chars from memory, printing them to the console, until it encounters a '\0'.


The other posters have already explained to you that in your

char bar[3] = "bar";

example the string terminator does not fit into the array, so the string ends up non-terminated. Formally speaking, it is not even a string (since strings are required to be terminated by definition). You are attempting to print a non-string as a string (using %s format specifier), which results in undefined behavior. Undefined behavior is exactly what you observe.

In the C++ language (for example), the

char bar[3] = "bar";

declaration would be illegal, since C++ does not allow the zero terminator to "fall off" in a declaration like that. C allows it, but only for the implicit zero terminator character. The

char bar[3] = "barr";

declaration is illegal in both C and C++.

Again, the "missing zero" trick works in C with the implicit zero terminator character only. It doesn't work with any explicit initializer: you are not allowed to explicitly specify more initializers than there are elements in the array. Which brings us to your third example. In your third example you have

char FOO[3] = { 'F', 'O', 'O', '\0' };

declaration, which explicitly specifies 4 initializers for an array of size 3. This is illegal in C. Your third example is not compilable. If your compiler accepted it without a diagnostic message, your compiler must be broken. The behavior of your third program cannot be explained by the C language, since it is not a C program.


As you hinted at knowing in your line char FOO[3]={'F','O','O','\0'}; this is a null termination issue. The problem is that the null terminator is a character. If you allocate memory for 3 characters, you can't put 4 characters in that location (it just takes the first 3 and truncates the rest).


You're missing the \0 at the end of the string, and an array with 4 elements is declared as FOO[4], not FOO[3].


In the first example you haven't got a null-terminated string. It just so happens that they are laid out in memory contiguously and thus the behaviour can be explained as the run over from one string to the other.

In the next example FOO is of size 3, but you are giving it four elements. In the same way, BAR is not null terminated.

char FOO[3]={'F','O','O','\0'};
char BAR[3]="BAR";


bar is not null terminated, so the printf keeps following the array until it gets to a '\0' character. The stack is arranged such that bar and foo are right next to each other in memory. The only way printf knows where a string ends is by finding a null terminator. So if you laid out your stack in memory it would look like:

  0    1    2    3    4    5    6
 'b'  'a'  'r'  'f'  'o'  'o'  '\0'
  ^bar begins    ^foo begins

By saying foo[] the compiler sets the size of foo based on the constant string it's initialized with. It's smart enough to make it 4 characters to include the null terminator, '\0'.

To solve this, the size of bar should actually be 4, ie:

 char bar[4] = "bar"; // extra space for null terminator

or better, let the compiler figure it out like you did with foo:

 char bar[] = "bar"; // compiler adds null term character('\0')


char bar[3]="bar"; doesn't leave enough space to add the terminating '\0' character.

If you do char bar[4]="bar";, you should get the result you expect.


Please read this detailed answer which will give insight into this...

The reason you're seeing "funny" things is because the strings are not NUL terminated...


The culprit is this line:

char bar[3]="bar";

This causes only 'b', 'a' and 'r' to be in the length 3 array that you have created.

Now as it so happens, the string in foo is 'f', 'o', 'o' and '\0' and it was allocated contiguous location with bar. So the memory looked like:

b | a | r | f | o | o | \0

I hope this makes it clear.


This line

char bar[3]="bar";

leads to undefined behaviour when bar is printed with %s, as "bar" is four characters once the '\0' is counted. So your bar array should be four bytes.

Undefined behaviour means anything can happen, including output that happens to look correct.
