
Difference between fixed-width strings and zero-terminated strings

gcc 4.4.4 c89

I got into a recent discussion about "fixed width strings" and "zero terminated strings".

When I think about this, they seem to be the same thing: a string with a terminating null.

i.e.

char *name = "Joe bloggs";

This is a fixed-width string that cannot be changed, and it also has a terminating null.

Also, in the discussion I was told that strncpy should never be used on "zero-terminated strings".

Many thanks for any suggestions,


The term "fixed-width string" usually refers to something completely different.

A fixed-width string of width N is a string of exactly N characters, where all N characters are guaranteed to be initialized. If you want to represent a shorter string, you have to pad it with zero characters at the end, adding as many as necessary to use up all N characters. Note that if you need to store a string of length exactly N, a fixed-width string will have no zero character at the end. That is, in the general case fixed-width strings are not zero-terminated!

What is the purpose of this? The purpose is to save 1 character when storing a string of the maximum possible length. If you are using fixed-width strings of width N, then you need exactly N characters to represent a string of length N. Compare that to ordinary zero-terminated strings, which would require N + 1 characters (the extra character is for the zero terminator).

Why is it padded with zeros at the end? It is padded with zeros to simplify lexicographic comparison of fixed-width strings: you simply compare all N characters until you hit a difference. Note that one can use absolutely any character to pad a fixed-width string to full length, as long as you still get the right lexicographic ordering. Using the zero character for padding is a good choice, though.
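To illustrate the comparison point, here is a minimal sketch. The width FW_WIDTH and the helper name fw_compare are made up for this example; they are not part of any standard API. It assumes both fields are fully initialized and zero-padded.

```c
#include <string.h>

#define FW_WIDTH 7  /* hypothetical field width, chosen for illustration */

/* Compare two zero-padded fixed-width strings of FW_WIDTH characters.
   Returns a value <0, 0 or >0, in the same spirit as strcmp.
   Because the padding is the zero character, a shorter string sorts
   before any longer string that starts with the same characters. */
int fw_compare(const char *a, const char *b)
{
    return memcmp(a, b, FW_WIDTH);
}
```

Since every field has the same width, no length bookkeeping is needed: a single memcmp over all FW_WIDTH characters does the whole job.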

When is it useful? Very rarely. The savings provided by fixed-width strings are rarely important in generic string processing: these savings are too small and only occur when the string uses the full width. But they might come in useful in some specific cases.

Where does all this come from? A classic example of a "fixed-width string" is the 14-char wide file name field in an old version of the Unix file system. It was represented by an array of 14 chars, and the fixed-width representation was used. At that time, saving 1 character on a full-length (all 14 characters) file name was important.

Now to strncpy. Function strncpy was specifically introduced for initializing those 14-character wide file name fields in that file system. It was created to generate a valid fixed-width string: it converts a zero-terminated string into a fixed-width string. Unfortunately, it was given a misleading name, which is the reason why many people today mistake it for a "safe" copying function for zero-terminated strings. The latter is a totally incorrect understanding of strncpy's purpose and functionality.

Using string literals to represent fixed-width strings (as in your example) is not a good idea, since string literals always add a zero character at the end, and fixed-width strings don't necessarily have one. This is how a bunch of fixed-width strings can be initialized in a C program:

char fw_string1[7] = { 'T', 'h', 'i', 's', ' ', 'i', 's' };
char fw_string2[7] = { 's', 't', 'r', 'i', 'n', 'g' };
char fw_string3[7] = { 'H', 'e', 'l', 'l', 'o' };

All arrays have the same number of elements: 7. Note that the first string is not zero-terminated, while the rest are zero-padded. Conversion of an "ordinary" string into a fixed-width one will look as follows:

char fw_string4[7];

strncpy(fw_string4, "Hi!", 7);

In this case function strncpy is used exactly for what it was intended to be used for.
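A small sketch of that conversion wrapped in a function may make the behavior easier to see. The names FW_WIDTH and fw_from_cstring are hypothetical, introduced only for this illustration.

```c
#include <string.h>

#define FW_WIDTH 7  /* hypothetical field width, chosen for illustration */

/* Convert a zero-terminated string into a fixed-width string.
   - If src is shorter than FW_WIDTH, the rest of dst is filled with zeros.
   - If src has FW_WIDTH or more characters, only the first FW_WIDTH are
     copied and dst ends up with NO terminating zero. */
void fw_from_cstring(char *dst, const char *src)
{
    strncpy(dst, src, FW_WIDTH);
}
```

For instance, converting "Hi!" leaves three characters followed by four zeros, while converting "This is" fills all seven characters and leaves no terminator, so the result must not be passed to functions such as strlen or strcpy.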

Keep in mind also that, aside from the conversion function strncpy, the standard library provides virtually no means for working with fixed-width strings. You basically have to treat them as raw character arrays and implement any higher-level operations manually. Most basic operations are naturally implemented by functions from the mem... group. memcmp, for example, implements comparison.

P.S. Actually, taking into account caf's comment, in the C language one can use string literals to initialize fixed-width strings, since C allows the literal initializer to be one character longer than the array (i.e. in C it is OK if the terminating zero does not fit into the array). So, the above can be equivalently rewritten as
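As a rough sketch of what "implementing operations manually" can look like, the helpers below compute the logical length of a fixed-width string and print it. FW_WIDTH, fw_length and fw_print are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define FW_WIDTH 7  /* hypothetical field width, chosen for illustration */

/* Number of characters before the zero padding, or FW_WIDTH when the
   string occupies the whole field and therefore has no zero at all. */
size_t fw_length(const char *s)
{
    const char *end = memchr(s, '\0', FW_WIDTH);
    return end ? (size_t)(end - s) : (size_t)FW_WIDTH;
}

/* Print a fixed-width string. The precision limits printf to at most
   FW_WIDTH characters, so a missing terminator is not a problem. */
void fw_print(const char *s)
{
    printf("%.*s\n", FW_WIDTH, s);
}
```

Note that strlen would be the wrong tool here, because it would run past the end of a full-width field looking for a terminator that is not there.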

char fw_string1[7] = "This is";
char fw_string2[7] = "string";
char fw_string3[7] = "Hello";

Note that fw_string1 is still not zero-terminated in this case.


First of all, I think you mean fixed-length string, not fixed-width string.

Second, the above is a null terminated string. It shouldn't be changed because of its definition as a literal constant.

AFAIK C doesn't have any real "fixed length strings". At best, you could define a buffer of size N and place no more than N-1 characters in it, where placing more would be an error and forgetting the null terminator can be an error.

As for strncpy, it copies at most the specified number of characters and zero-pads the rest of the destination if the source is shorter. This means that if the destination is smaller than the count you pass, you would be writing past the available space, and if the source is at least as long as the count, the result has no null terminator, leading to errors when you attempt to use it as a string.
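If strncpy is used to fill an ordinary null-terminated buffer anyway, the terminator has to be written explicitly. A minimal sketch, assuming truncation of an overlong source is acceptable (copy_terminated is a hypothetical helper, not a standard function):

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity dst_size) and always null-terminate.
   An overlong src is silently truncated to dst_size - 1 characters. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;  /* no room for even the terminator */
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';  /* strncpy does not guarantee this */
}
```

A call such as copy_terminated(buf, sizeof buf, input) then never writes outside buf and always leaves a usable string in it.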


I am not quite sure about the term "fixed-width string". Depending on the C function, strings do or do not need an ending \0. Functions like strlen and strcpy need to work on \0-terminated strings in order to know when to stop. Functions like strncpy don't need the source string to be \0-terminated, since one argument tells how many characters to copy at most.

When you declare name as you do, the contents that name points to are typically stored in read-only memory and must not be modified. However, you can still use name in C functions that do not modify the contents, e.g. strlen(name), or as a source:

char mycopy[32];
strcpy( mycopy, name );