Objective-C doesn't like my unichars?

2022-12-18 12:04

Xcode complains about a "multi-character character constant" when I try to do the following:

static unichar accentCharacters[] = { 'ā', 'á', 'ă', 'à' };

How do you make an array of characters when not all of them are ASCII? The following works just fine:

static unichar accent[] = { 'a', 'b', 'c' };

Workaround

The closest workaround I have found is to write the special characters as hex values, i.e. this works:

static unichar accentCharacters[] = { 0x0100, 0x0101, 0x0102 };
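For reference, the four letters from the question correspond to the code points U+0101, U+00E1, U+0103 and U+00E0. Below is a minimal sketch of the hex workaround in plain C, assuming unichar is an unsigned 16-bit integer as Foundation defines it. The typedef unichar16, the table name, and the helper is_accent_character are stand-ins invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Foundation's unichar (an unsigned 16-bit type); hypothetical name. */
typedef uint16_t unichar16;

/* Unicode code points of the four accented letters from the question. */
static const unichar16 accent_characters[] = {
    0x0101, /* ā  LATIN SMALL LETTER A WITH MACRON */
    0x00E1, /* á  LATIN SMALL LETTER A WITH ACUTE  */
    0x0103, /* ă  LATIN SMALL LETTER A WITH BREVE  */
    0x00E0  /* à  LATIN SMALL LETTER A WITH GRAVE  */
};

/* Returns 1 if c is one of the accented letters in the table, 0 otherwise. */
static int is_accent_character(unichar16 c)
{
    size_t count = sizeof accent_characters / sizeof accent_characters[0];
    for (size_t i = 0; i < count; i++) {
        if (accent_characters[i] == c) {
            return 1;
        }
    }
    return 0;
}
```

Because the values are plain integers, the table compiles the same way regardless of how the source file is encoded.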

It's not that Objective-C doesn't like it; it's that C doesn't. The constant 'c' is meant to hold a single char, which is 1 byte, whereas a unichar is 2 bytes. (See the note below for a bit more detail.)

There's no perfectly supported way to represent a unichar constant. You can use

char* s="ü";

in a UTF-8-encoded source file to get a UTF-8-encoded C string, or

NSString* s=@"ü";

in a UTF-8-encoded source file to get an NSString. (This was not possible before Mac OS X 10.5. It's OK on iPhone.)

NSString itself is conceptually encoding-neutral, but if you want, you can get an individual UTF-16 code unit (a unichar) by using -characterAtIndex:.

Finally, two comments:

If you just want to remove accents from a string, you can use a method like this, without writing the table yourself:

-(NSString*)stringWithoutAccentsFromString:(NSString*)s
{
    if (!s) return nil;
    NSMutableString *result = [NSMutableString stringWithString:s];
    CFStringFold((CFMutableStringRef)result, kCFCompareDiacriticInsensitive, NULL);
    return result;
}

See the documentation for CFStringFold.

If you want Unicode characters for localization/internationalization, you shouldn't embed the strings in the source code. Instead, you should use Localizable.strings and NSLocalizedString.

Note: For arcane historical reasons, 'a' is an int in C. In C++, it's a char. But that doesn't change the fact that writing more than one byte inside '...' is implementation-defined and not recommended. For example, see the ISO C Standard, section 6.4.4.4, paragraph 10. However, it was common in classic Mac OS to write a four-letter code enclosed in single quotes, like 'APPL'. But that's another story...
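The int-versus-char point can be checked directly with sizeof. This is a small sketch, valid only when compiled as C rather than C++; the two function names are made up for the illustration.

```c
#include <stddef.h>

/* In C, a character constant such as 'a' has type int, so this yields
   sizeof(int). Compiled as C++, the same expression would yield 1. */
static size_t size_of_char_constant(void)
{
    return sizeof('a');
}

/* A char object always has size 1, by definition of sizeof. */
static size_t size_of_char_object(void)
{
    char c = 'a';
    return sizeof c;
}
```

On a typical desktop platform the first function returns 4 and the second returns 1, which is why a character constant has room for more than one byte in the first place.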

Another complication is that accented letters are not always represented by 1 byte; it depends on the encoding. In UTF-8, they are not. In ISO-8859-1, they are. And a unichar should be in UTF-16. Did you save your source code in UTF-16? I think the default in Xcode is UTF-8. GCC might do some encoding conversion depending on the setup, too...
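To make the encoding dependence concrete, here is a short sketch that spells out "á" (U+00E1) byte by byte in both encodings, so the result does not depend on how the source file itself is saved. The variable and function names are chosen for this example only.

```c
#include <stddef.h>
#include <string.h>

/* "á" (U+00E1) written as explicit bytes in two different encodings. */
static const char a_acute_utf8[]   = "\xC3\xA1"; /* UTF-8: two bytes     */
static const char a_acute_latin1[] = "\xE1";     /* ISO-8859-1: one byte */

/* Number of bytes before the terminating NUL. */
static size_t byte_length(const char *s)
{
    return strlen(s);
}
```

In a UTF-8 source file, 'á' therefore puts two bytes between the single quotes, which is what triggers the multi-character constant warning.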

Or you can just do it like this:

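static unichar accentCharacters[] = { L'ā', L'á', L'ă', L'à' };

L is not a keyword but the standard C prefix for a wide character constant or wide string literal. It gives the constant the type wchar_t instead of int, so a single non-ASCII character fits in it.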

Works fine for Objective-C too.

Note: The compiler may give you a strange warning about too many characters being put inside a unichar, but you can safely ignore that warning. Xcode just doesn't deal with Unicode characters the right way, but the compiler parses them properly and the result is OK.
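A variation on the L prefix that avoids depending on the source file's encoding is to spell each letter as a universal character name (\uXXXX). The sketch below assumes wchar_t values are Unicode code points, which holds for GCC and Clang on macOS and Linux but is not guaranteed by the C standard. The typedef unichar16 and the table name are stand-ins for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Stand-in for Foundation's unichar (an unsigned 16-bit type); hypothetical name. */
typedef uint16_t unichar16;

/* Wide character constants written with universal character names.
   The cast narrows wchar_t to 16 bits, which is lossless for these
   letters because they all lie in the Basic Multilingual Plane. */
static const unichar16 accent_characters_wide[] = {
    (unichar16)L'\u0101', /* a with macron */
    (unichar16)L'\u00E1', /* a with acute  */
    (unichar16)L'\u0103', /* a with breve  */
    (unichar16)L'\u00E0'  /* a with grave  */
};

/* Number of entries in the table. */
static size_t accent_count(void)
{
    return sizeof accent_characters_wide / sizeof accent_characters_wide[0];
}
```

Since the source then contains only ASCII, no multi-character warning should arise from the file encoding.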

Depending on your circumstances, this may be a tidy way to do it:

NSCharacterSet* accents = 
    [NSCharacterSet characterSetWithCharactersInString:@"āáăà"];

And then, if you want to check if a given unichar is one of those accent characters:

if ([accents characterIsMember:someOtherUnichar])
{
}

NSString also has many methods of its own for handling NSCharacterSet objects.
