What encoding do strings found in the Mach-O __DATA segment, __cfstring section use?
I'm wondering how to properly read strings from a specific section of a Mach-O binary. (This is a binary for iOS.)
I'm curious about the strings found in the __DATA segment, __cfstring section. This section appears to contain an array of simple structures:
struct NSConstantString
{
    Class class;
    const char *string;
    int length;
};
The question comes down to: how do you decide the encoding of the string?
It's described in the source of CFString, available here. The string is either ASCII or UTF-16 (in the processor's native endianness).
Also see the source code of clang, available here. Look for GenerateConstantString. Constant strings are eventually generated by GetAddrOfConstantCFString, so look for that too. The source code says that a constant CFString has the format
struct __builtin_CFString {
    const int *isa;   // points to __CFConstantStringClassReference
    int flags;
    const char *str;
    long length;
};
(At least on OS X; I'm not sure about iOS.) The flags field tells you whether the string is ASCII or UTF-16.
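As a rough sketch of how a reader tool might act on flags, here is some C that picks the encoding and converts the character data to UTF-8. The helper names (cfstr_encoding_from_flags, cfstr_byte_length, cfstr_to_utf8) are made up for this example. The flag constants are what I recall clang emitting (0x07C8 for ASCII, 0x07D0 for UTF-16, differing in the 0x10 bit), so treat them as an assumption and check them against the clang and CFString sources for your toolchain. It also assumes a little-endian target and that length counts characters (UTF-16 code units), not bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, from my reading of clang's GetAddrOfConstantCFString.
   Verify against the compiler that produced your binary. */
enum {
    CFSTR_FLAGS_ASCII    = 0x07C8,
    CFSTR_FLAGS_UTF16    = 0x07D0,
    CFSTR_IS_UNICODE_BIT = 0x10   /* the bit that differs between the two */
};

typedef enum { CFSTR_ENC_ASCII, CFSTR_ENC_UTF16 } cfstr_encoding;

/* Decide the encoding from the flags field of the structure. */
static cfstr_encoding cfstr_encoding_from_flags(uint32_t flags)
{
    return (flags & CFSTR_IS_UNICODE_BIT) ? CFSTR_ENC_UTF16 : CFSTR_ENC_ASCII;
}

/* Number of bytes of character data, given the length field. */
static size_t cfstr_byte_length(uint32_t flags, size_t length)
{
    return cfstr_encoding_from_flags(flags) == CFSTR_ENC_UTF16 ? length * 2
                                                               : length;
}

/* Convert `length` characters at `data` to NUL-terminated UTF-8 in `out`.
   UTF-16 input is read as little-endian (an assumption for iOS targets).
   Returns the number of bytes written (excluding the NUL), or -1 if the
   output buffer might be too small. */
static long cfstr_to_utf8(uint32_t flags, const uint8_t *data, size_t length,
                          char *out, size_t out_size)
{
    assert(data != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    int utf16 = cfstr_encoding_from_flags(flags) == CFSTR_ENC_UTF16;
    size_t o = 0;

    for (size_t i = 0; i < length; i++) {
        uint32_t cp;
        if (utf16) {
            cp = (uint32_t)data[2 * i] | ((uint32_t)data[2 * i + 1] << 8);
            /* Combine a high surrogate with the following low surrogate. */
            if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < length) {
                uint32_t lo = (uint32_t)data[2 * i + 2] |
                              ((uint32_t)data[2 * i + 3] << 8);
                if (lo >= 0xDC00 && lo <= 0xDFFF) {
                    cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                    i++;
                }
            }
        } else {
            cp = data[i]; /* bytes >= 0x80 are passed through as Latin-1 */
        }

        /* Conservative check: room for up to 4 bytes plus the final NUL. */
        if (o + 5 > out_size)
            return -1;

        if (cp < 0x80) {
            out[o++] = (char)cp;
        } else if (cp < 0x800) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return (long)o;
}
```

To use it, read each structure out of the __cfstring section, map its str pointer to a file offset, and pass the flags, the bytes at that offset, and the length to cfstr_to_utf8. Note that the field widths in the struct depend on whether the binary is 32-bit or 64-bit, so the parsing of the structure itself is left out here.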