
regex problem with Ctrl-M

I want to match the following:

boolean b = "\u000D".matches("\\cM");

but the compiler gives me:

unclosed string literal
illegal character: \92
illegal character: \92
unclosed string literal
not a statement

Why? Isn't that literal a valid Unicode escape for Ctrl-M?


The "unclosed string literal" error occurs because \uXXXX escape sequences are translated before the source is tokenized. So

boolean b = "\u000D".matches("\\cM");

becomes

boolean b = "
".matches("\\cM");

which is invalid Java code, because a string literal cannot span a line break. (Yes, it also means you could write String foo = \u0022\u0021\u0022; and it would compile correctly.)
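To see the early translation from the harmless side, here is a minimal sketch of the String foo example as a complete class. The class name UnicodeEscapeDemo is made up for illustration; the comments spell the code points as U+XXXX on purpose, since a \u000D written inside a comment would be translated too and would end the comment early.

```java
public class UnicodeEscapeDemo {
    // The three escapes below are U+0022 ("), U+0021 (!) and U+0022 (").
    // They are translated before tokenizing, so the compiler effectively
    // sees:  static String foo = "!";
    static String foo = \u0022\u0021\u0022;

    public static void main(String[] args) {
        System.out.println(foo.equals("!")); // prints true
        System.out.println(foo.length());    // prints 1
    }
}
```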

If you write instead

boolean b = "\r".matches("\\cM"); // \r == \u000D

then the code works and returns true.
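As a rough sketch, these are a few ways to get a carriage return into the source without a Unicode escape in a string literal, each checked against \cM. The class name CtrlMDemo and the helper isCtrlM are invented for this example.

```java
public class CtrlMDemo {
    // U+000D (carriage return), built without a Unicode escape in the source.
    static final String CR = String.valueOf((char) 0x0D);

    // True when s is exactly one Ctrl-M character.
    static boolean isCtrlM(String s) {
        return s.matches("\\cM");
    }

    public static void main(String[] args) {
        System.out.println(isCtrlM("\r"));       // true: the Java string escape
        System.out.println(isCtrlM(CR));         // true: the cast from 0x0D
        System.out.println(CR.matches("\\r"));   // true: the regex escape for CR
        System.out.println(CR.matches("\\x0D")); // true: the regex hex escape
        // The doubled backslash keeps the compiler from translating the
        // escape, so the regex engine resolves it instead.
        System.out.println(CR.matches("\\u000D")); // true
        System.out.println(isCtrlM("\n"));       // false: line feed is Ctrl-J
    }
}
```

The doubled backslash matters in the pattern strings: a backslash preceded by another backslash does not start a Unicode escape at compile time, so the pattern reaches the regex engine intact.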


Haha !

This is a trap!

Java processes Unicode escapes before it tokenizes the source. So it converts your code into:

boolean b = "
".matches("\\cM"); 

... and so it is definitely an error: an unterminated string literal, and so on.


This might be unrelated, but I wanted to remove Ctrl-M from a field in a database (Vertica).

I used the function below and it worked for me.

REGEXP_REPLACE(<column_name>,'[[:cntrl:]]')
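For anyone doing the same cleanup on the Java side, a rough equivalent might look like the sketch below. It assumes Java's \p{Cntrl} class is an acceptable stand-in for [[:cntrl:]]; the class and method names are made up. Note that this removes every control character, including tabs and line feeds, so replace("\r", "") is the safer choice if only Ctrl-M should go.

```java
public class StripControlChars {
    // Removes every character matched by the POSIX class \p{Cntrl},
    // which covers the ASCII control range and DEL.
    static String stripControl(String s) {
        return s.replaceAll("\\p{Cntrl}", "");
    }

    public static void main(String[] args) {
        String raw = "line one\r\nline two\r";
        // Both the carriage returns and the line feed are removed.
        System.out.println(stripControl(raw)); // prints line oneline two
    }
}
```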