Reading and outputting UTF-8 strings in C/Cocoa

2022-12-17 15:37

In an Objective-C/Cocoa app, I am using C functions to open a text file, read it line by line, and pass some lines to a third-party function. In pseudo-code:

char line[1024];
fgets(line, sizeof line, aFile);
library_function(line);  // This function expects a UTF-8 encoded char * string

This works fine until the input file contains special characters (such as accents or the UTF-8 BOM) whereupon the library function outputs mangled characters.

However, if I do this:

char line[1024];
fgets(line, sizeof line, aFile);
NSString *stringObj = [NSString stringWithUTF8String:line];
library_function([stringObj UTF8String]);

Then it all works fine and the string is output correctly.

What is that [NSString... line doing that I'm not? Am I doing something wrong with how the line is fetched initially? Or is it something else entirely?

UTF-8 is a variable-width, multi-byte encoding of Unicode (see Wikipedia), which means some characters take more than one byte (the accented ones you've run into, for example). C's char type is a single byte, so C's definition of "character" doesn't match Unicode's.
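
To make the byte-versus-character mismatch concrete, here is a minimal sketch in plain C. The helper name utf8_codepoints is made up for illustration, and it assumes the input is already valid UTF-8 (it does no validation). It counts code points by skipping continuation bytes, which always have the bit pattern 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a NUL-terminated
   UTF-8 string. Every byte that is NOT a continuation byte (10xxxxxx)
   starts a new code point. Assumes valid UTF-8; nothing is validated. */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if ((*p & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the string "caf\xC3\xA9" (the word "café" encoded as UTF-8), strlen reports 5 bytes while utf8_codepoints reports 4 code points.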

If you want to read Unicode with the standard C runtime library, you'll also need to use a Unicode conversion library, such as libiconv.
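
As a rough sketch of what a conversion step with iconv might look like: the function below assumes, purely for illustration, that the file turned out to be ISO-8859-1 rather than UTF-8. The name latin1_to_utf8 is hypothetical. It uses the POSIX iconv_open / iconv / iconv_close calls; on some platforms you may need to link with -liconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert NUL-terminated ISO-8859-1 text to UTF-8.
   Writes a NUL-terminated result into out (capacity out_size bytes).
   Returns 0 on success, -1 on any failure (including too small a buffer). */
int latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");  /* (to, from) */
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;          /* iconv does not modify the input bytes */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size - 1;  /* reserve room for the terminator */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return 0;
}
```

The converted buffer could then be handed to library_function. If the file really is UTF-8 already, no conversion is needed and the bytes from fgets can be passed through unchanged.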

(Using wchar_t may also work; I've never researched it.)

Or you can use NSString, which already supports Unicode.
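
On the UTF-8 BOM mentioned in the question: if you stay in plain C, one option is to skip the three BOM bytes (EF BB BF) on the first line before passing it on. This is only a sketch; skip_utf8_bom is a made-up name, and whether library_function actually chokes on the BOM is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return a pointer just past a leading UTF-8 BOM
   (bytes EF BB BF), or the original pointer if there is no BOM. */
const char *skip_utf8_bom(const char *line)
{
    assert(line != NULL);
    if (strncmp(line, "\xEF\xBB\xBF", 3) == 0)
        return line + 3;
    return line;
}
```

You would call it only on the first line read from the file, for example library_function(skip_utf8_bom(line)).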
