UTF-8 strings and malloc in C
I read a directory's contents with "opendir" and "readdir". During that process I do some string manipulation and allocation, something like this:
int stringlength = strlen(cur_dir)+strlen(ep->d_name)+2;
char *file_with_path = xmalloc(stringlength); //xmalloc is a malloc wrapper with some tests (like no more memory)
snprintf (file_with_path, (size_t)stringlength, "%s/%s", cur_dir, ep->d_name);
But what if a string contains a two-byte UTF-8 character? How do you handle that? Allocate stringlength*2?
Thanks
strlen() counts the bytes in the string; it does not care whether those bytes represent UTF-8 encoded Unicode characters. So, for example, strlen() of a string containing the UTF-8 encoding of "aöü" would return 5, since the string is encoded as "a\xc3\xb6\xc3\xbc".
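To make the byte-versus-character distinction concrete, here is a minimal sketch. The helper names utf8_codepoints and demo_byte_vs_codepoint are made up for illustration (they are not standard functions), and the counting assumes the input is valid UTF-8:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts UTF-8 code points by skipping
 * continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical demo of the "aöü" example: 5 bytes, 3 code points. */
void demo_byte_vs_codepoint(void)
{
    const char *s = "a\xc3\xb6\xc3\xbc"; /* "aöü" encoded as UTF-8 */

    assert(strlen(s) == 5);          /* bytes, which is what malloc needs */
    assert(utf8_codepoints(s) == 3); /* characters as a reader sees them */
    printf("bytes: %zu, code points: %zu\n", strlen(s), utf8_codepoints(s));
}
```

For sizing a buffer, only the byte count matters, so the code point count is shown here purely for comparison.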
strlen counts the number of bytes in a string (up to, but not including, the terminating NUL), not the number of UTF-8 characters, so stringlength should already be as large as you need it.
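As a rough sketch of the same allocation without any doubling, the path can be built like this. The function name join_path is hypothetical, and plain malloc stands in for the question's xmalloc wrapper, whose exact behavior is not shown:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "dir/name" string,
 * or NULL on failure. The caller must free() the result. */
char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);

    /* strlen() already returns byte counts, so multi-byte UTF-8
     * characters are covered: +1 for '/', +1 for the terminating NUL. */
    size_t len = strlen(dir) + 1 + strlen(name) + 1;

    char *path = malloc(len);
    if (path == NULL)
        return NULL;

    int written = snprintf(path, len, "%s/%s", dir, name);
    if (written < 0 || (size_t)written >= len) {
        /* Encoding error or truncation; should not happen with this len. */
        free(path);
        return NULL;
    }
    return path;
}
```

Calling join_path(cur_dir, ep->d_name) inside the readdir loop would then behave the same for ASCII and non-ASCII file names, as long as the result is freed after use.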