开发者

How do I match non-ASCII characters with RegexKitLite?

I am using RegexKitLite and I'm trying to match a pattern.

The following regex patterns do not capture my word that includes N with a titlde: ñ. Is there a string conversion I am missing?

subjectString = @"define_añadir";
//regexString = @"^define_(.*)"; //this pattern does not match, so I assume to add the ñ     
//regexString = @"^define_([.ñ]*)"; //tried this pattern first with a range
regexString = @"^define_((?:\\w|ñ)*)"; //tried second

NSString *captured= [subjectString stringByMatching:regexString capture:1L];
//I开发者_JAVA技巧 want captured == añadir


Looks like an encoding problem to me. Either you're saving the source code in an encoding that can't handle that character (like ASCII), or the compiler is using the wrong encoding to read the source files. Going back to the original regex, try creating the subject string like this:

subjectString = @"define_a\xC3\xB1adir";

or this:

subjectString = @"define_a\u00F1adir";

If that works, check the encoding of your source code files and make sure it's the same encoding the compiler expects.

EDIT: I've never worked with the iPhone technology stack, but according to this doc you should be using the stringWithUTF8String method to create the NSString, not the @"" literal syntax. In fact, it says you should never use non-ASCII characters (that is, anything not in the range 0x00..0x7F) in your code; that way you never have to worry about the source file's encoding. That's good advice no matter what language or toolset you're using.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜