Matching an all-uppercase string with a Java regex pattern

Hi everyone, I've been trying to solve this since yesterday.

What pattern matches a string made up of (A-Z)*, (\\p{Punct})*, (0-9)* and (\\s), where every letter in the string is uppercase?

For example:

  • PATTERN {001}

  • OTHERS PATTERN (002-005)

Edit: a moment ago I came up with this pattern for the question above:

(([A-Z])*|(\\p{Punct})*|([0-9])*|(\\s)*)*

The new problem is extracting the uppercase substring from a string whose fields are separated by "|".

I used code like this:

    String theString = "";
    String theUppercase = "";
    Pattern level5Patter = Pattern.compile("(([A-Z])*|(\\p{Punct})*|([0-9])*|(\\s)*)*\\|");
    Matcher level5Matcher = level5Patter.matcher(strFileContent);
    while (level5Matcher.find()) {
        String resultLevel5 = level5Matcher.group();
        if (resultLevel5.toUpperCase().equals(resultLevel5)) {
            System.out.println(resultLevel5);
        } else {
            theString = theString + resultLevel5;
        }
    }

The substring I want looks like this:

TITLE OF THIS DATA IS ALWAYS UPPERCASE AND SOMETIME CONTAIN NUMERIC 1.0.0.0.0 EVEN PUNCTUATION {}

The source string looks like this:

Head 1|Head 1.0|Head 1.0.0|Head 1.0.0.0|TITLE OF THIS DATA IS ALWAYS UPPERCASE AND SOMETIME CONTAIN NUMERIC 1.0.0.0.0 EVEN PUNCTUATION {}|first data description sometime contains UPPERCASE and numeric 1010 and punctuation {}|01234|Head 1|Head 1.0|Head 1.0.0|Head 1.0.0.1|TITLE OF THIS DATA IS ALWAYS UPPERCASE AND SOMETIME CONTAIN NUMERIC 1.0.0.1.0 EVEN PUNCTUATION|first data description sometime contains UPPERCASE and numeric 1010 and punctuation {}|56789|

Thanks in advance.


Create a character class and put into it everything you want to allow:

Pattern p = Pattern.compile("^[A-Z0-9\\p{P}\\s]+$");

[A-Z0-9\\p{P}\\s] is a character class that allows A-Z, 0-9, punctuation and whitespace.

^ is an anchor for the start of the string

$ is an anchor for the end of the string

+ is a quantifier that allows one or more characters from the class
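Applied to the pipe-separated source in the question, the character class can be used as a per-field test after splitting on "|". The following is only a minimal sketch: the class name UppercaseFields and the method upperFields are made up for illustration, and the lookahead (?=.*[A-Z]) is an added assumption so that digit-only fields such as 01234 are not reported as titles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class UppercaseFields {
    // Character class from the answer. matches() must consume the whole
    // input, so the ^ and $ anchors are not needed here.
    // Assumption (not in the answer): the lookahead requires at least one
    // A-Z letter, so that purely numeric fields are skipped.
    static final Pattern UPPER =
            Pattern.compile("(?=.*[A-Z])[A-Z0-9\\p{P}\\s]+");

    // Returns the fields of a "|"-separated string that match UPPER.
    static List<String> upperFields(String source) {
        List<String> result = new ArrayList<>();
        // "|" is a regex metacharacter, so it is escaped for split().
        for (String field : source.split("\\|")) {
            if (UPPER.matcher(field).matches()) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "Head 1|Head 1.0|TITLE OF THIS DATA 1.0.0.0.0 {}"
                + "|first data description 1010 {}|01234|";
        // prints [TITLE OF THIS DATA 1.0.0.0.0 {}]
        System.out.println(upperFields(source));
    }
}
```

Splitting first keeps the regex simple: each field is checked as a whole, so no trailing \\| has to be part of the pattern.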

A more Unicode-aware approach would be:

^[\\p{Lu}\\p{N}\\p{P}\\s]+$

\\p{Lu} matches an uppercase letter that has a lowercase variant.

\\p{N} matches any kind of numeric character in any script.
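To see the difference between the two patterns, here is a small sketch that tests an accented uppercase string against both. The class name UnicodeUpper and its helper methods are hypothetical; the sample text is written with Unicode escapes so the source file's encoding does not matter.

```java
import java.util.regex.Pattern;

public class UnicodeUpper {
    // ASCII-only pattern from the first part of the answer.
    static final Pattern ASCII = Pattern.compile("^[A-Z0-9\\p{P}\\s]+$");
    // Unicode-aware pattern from the second part of the answer.
    static final Pattern UNICODE = Pattern.compile("^[\\p{Lu}\\p{N}\\p{P}\\s]+$");

    static boolean asciiUpper(String s) {
        return ASCII.matcher(s).matches();
    }

    static boolean unicodeUpper(String s) {
        return UNICODE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // U+00C9 is LATIN CAPITAL LETTER E WITH ACUTE, category Lu.
        String s = "\u00C9T\u00C9 2024";
        System.out.println(asciiUpper(s));   // false: U+00C9 is outside A-Z
        System.out.println(unicodeUpper(s)); // true: \p{Lu} covers it
    }
}
```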

See regular-expressions.info for more information.


I must admit it is not entirely clear to me what you are asking. Could you try to rephrase your question?

Under the assumption that you are trying to combine some character classes, in other words, you want a pattern that accepts any string consisting of any sequence of characters from the character classes '[A-Z]', '\p{Punct}' and '[0-9]', this would become something like: '([A-Z0-9]|\p{Punct})*'. Beware of double escaping when encoding this as a String:

Pattern p = Pattern.compile("([A-Z0-9]|\\p{Punct})*");
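As a quick check of what this pattern accepts, here is a minimal sketch. It builds the pattern with Pattern.compile, since java.util.regex.Pattern has no public constructor; the class name PunctPattern and the method accepts are made up for illustration.

```java
import java.util.regex.Pattern;

public class PunctPattern {
    // Uppercase letters, digits and POSIX punctuation, repeated zero or more times.
    static final Pattern P = Pattern.compile("([A-Z0-9]|\\p{Punct})*");

    // True when the whole string consists only of allowed characters.
    static boolean accepts(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("ABC-123")); // true
        System.out.println(accepts(""));        // true: * also allows the empty string
        System.out.println(accepts("ABC 123")); // false: whitespace is not in the pattern
        System.out.println(accepts("abc"));     // false: lowercase letters are rejected
    }
}
```

Two things worth noticing: the * quantifier lets the empty string through, and whitespace would need \\s added to the alternation if spaces should be allowed.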