How do I count odd and even amounts of characters with regular expressions?
I'm trying to pull out all strings which have an even number of B's and an odd number of C's. I have the regexes to match odd A's and even B's but I cannot get the two to work together. The strings are delimited by whitespace (tabs, newlines, spaces).
e.g.
XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA
I have for odd A's
\b[^A]*A([^A]*A[^A]*A)*[^A]*\b
And for even B's
\b[^B]*(B[^B]*B[^B]*)*[^B]*\b
I know I need to use a positive lookahead and have tried:
\b(?=[^A]*A([^A]*A[^A]*A)*[^A]*\b)[^B]*(B[^B]*B[^B]*)*[^B]*\b
but it doesn't work - does anybody know why?
The problem is that your regexes (regexen?) can match zero characters: \b\b will match on a single word boundary, and so will \b{someregexthatcanmatchzerocharacters}\b.
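A minimal sketch of that zero-width behaviour, in case it helps to see it. The class name ZeroWidthDemo and the helper firstMatch are made up for illustration; the second pattern is the even-B regex from the question, run against the single-character input "B".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroWidthDemo {
    // Returns the first match of regex in input as "start-end:text", or null if there is none.
    // (Hypothetical helper, only for this illustration.)
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + "-" + m.end() + ":" + m.group() : null;
    }

    public static void main(String[] args) {
        // Two consecutive word boundaries succeed without consuming anything.
        System.out.println(firstMatch("\\b\\b", "AB")); // prints 0-0:
        // The question's even-B pattern also succeeds on a lone "B" by matching nothing.
        System.out.println(firstMatch("\\b[^B]*(B[^B]*B[^B]*)*[^B]*\\b", "B")); // prints 0-0:
    }
}
```

Both calls report a match that starts and ends at index 0, so the pattern "succeeds" on a word with one B without ever looking at the B.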
As Anon already mentioned: your pattern can match the empty string, so m.find() can report a match without consuming any characters. So, you need to let your even B's actually match strings containing 2, 4, 6, ... B's. If you want, you can alternate between an even number of B's and this: [^B\\s]+ (which matches strings containing 0 B's). As long as you actually match one or more characters with it, you should be okay.
Also, you don't want the negated classes inside the lookaheads to match whitespace: otherwise a lookahead can run past the end of the current word, and you get too many matches.
Try something like this:
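import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA";
// Odd number of A's within a single word ([^A\s] never crosses whitespace).
String oddAs = "\\b[^A\\s]*A([^A\\s]*A[^A\\s]*A)*[^A\\s]*\\b";
// Even number of B's: either one or more pairs of B's, or a non-empty word with no B at all.
String evenBs = "\\b([^B\\s]*(B[^B\\s]*B[^B\\s]*)+|[^B\\s]+)\\b";
// Both lookaheads must succeed at the same position; \S+ then consumes the word.
Pattern p = Pattern.compile(String.format("(?=%s)(?=%s)\\S+", oddAs, evenBs));
Matcher m = p.matcher(text);
while (m.find()) {
    System.out.println(m.group());
}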
which produces:
ABCDEBCC
ABBAAJSER
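The patterns above follow the question's regexes (odd A's, even B's), while the question's text asks for an even number of B's and an odd number of C's. Here is a rough sketch of the same lookahead idea for that second reading. The class ParityLookahead and the helpers parity and evenBsOddCs are hypothetical names, and the sketch assumes the counted character is a plain letter that needs no escaping. It anchors on whitespace with (?<!\S) and (?!\S) instead of \b, which is a swapped-in choice, not what the answer above uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParityLookahead {
    // Builds a lookahead asserting that the token starting here contains an odd
    // (odd == true) or even (odd == false) number of c.
    // Assumption: c is a plain letter or digit, so it needs no regex escaping.
    static String parity(char c, boolean odd) {
        String other = "[^" + c + "\\s]*";                  // non-whitespace run without c
        String pairs = "(?:" + other + c + other + c + ")*"; // zero or more pairs of c
        String lead = odd ? other + c : "";                  // one extra c for the odd case
        return "(?=" + lead + pairs + other + "(?!\\S))";    // must reach the end of the token
    }

    // Returns the whitespace-delimited tokens with an even number of B's and an odd number of C's.
    static List<String> evenBsOddCs(String text) {
        Pattern p = Pattern.compile("(?<!\\S)" + parity('B', false) + parity('C', true) + "\\S+");
        Matcher m = p.matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(evenBsOddCs("XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA"));
        // prints [XABBAC, ABCDEBCC]
    }
}
```

Because \S+ always consumes at least one character, the overall pattern can never produce an empty match, even though the even-B lookahead accepts a token with zero B's.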
With commons.lang.StringUtils it's even more concise:
String data = "XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA";
String[] items = data.split("\\s+");
for (String item : items) {
    // StringUtils.countMatches counts occurrences of the second argument in the first.
    if (StringUtils.countMatches(item, "B") % 2 == 0
            && StringUtils.countMatches(item, "C") % 2 != 0) {
        System.out.println(item);
    }
}
regex is overrated
String str = "XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA";
String[] s = str.split("\\s+");
for (int j = 0; j < s.length; j++) {
    int countC = 0;
    int countB = 0;
    for (int i = 0; i < s[j].length(); i++) {
        char c = s[j].charAt(i);
        if (c == 'C') countC++;
        if (c == 'B') countB++;
    }
    if ((countC % 2) != 0)
        System.out.println(s[j] + " has odd C");
    if ((countB % 2) == 0)
        System.out.println(s[j] + " has even B");
}
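The loop above reports the two conditions separately. If you only want the strings that satisfy both at once, a small variation could look like the sketch below. It uses only the standard library, and the names ParityFilter, count and evenBsOddCs are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ParityFilter {
    // Counts how many times target occurs in word.
    static long count(String word, char target) {
        return word.chars().filter(ch -> ch == target).count();
    }

    // Keeps the whitespace-delimited tokens with an even number of B's and an odd number of C's.
    static List<String> evenBsOddCs(String text) {
        List<String> kept = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (count(word, 'B') % 2 == 0 && count(word, 'C') % 2 != 0) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(evenBsOddCs("XABBAC ABCDEBCC ABSDERERES ABBAAJSER HGABAA"));
        // prints [XABBAC, ABCDEBCC]
    }
}
```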