
Java: How do I determine why a regular expression pattern match fails?

I am using a regular expression to check whether a string matches a pattern, but I also want to know why the match fails when it does.

For example, say I have a pattern of "N{1,3}Y". I match it against string "NNNNY". I would like to know that it failed because there were too many Ns. Or if I match it against string "XNNY", I would like to know that it failed because an invalid character "X" was in the string.

From looking at the Java regular expression package API (java.util.regex), additional information only seems to be available from the Matcher class when the match succeeds.

Is there a way to resolve this issue? Or are regular expressions even an option in this scenario?


I think you should use a parser rather than a plain regular expression.

Regular expressions are good at telling you whether a string matches, but not at reporting non-matches, let alone explaining why a match failed.
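To illustrate what "use a parser" could look like for the pattern from the question, here is a minimal sketch of a hand-written check for N{1,3}Y. The class NyChecker and its explain method are names made up for this example, and the messages are only one possible wording. It reports a single problem per input, not every problem.

```java
// Hypothetical helper, not part of java.util.regex: a hand-written check
// for the pattern N{1,3}Y that returns a reason when the input is rejected.
class NyChecker {

    /** Returns "OK", or a description of one reason the input does not match N{1,3}Y. */
    static String explain(String input) {
        int i = 0;
        // Consume the leading run of 'N' characters.
        while (i < input.length() && input.charAt(i) == 'N') {
            i++;
        }
        int nCount = i;
        if (nCount > 3) {
            return "too many N's: found " + nCount + ", at most 3 allowed";
        }
        if (i == input.length()) {
            return "input ended at index " + i + " before 'Y'";
        }
        char c = input.charAt(i);
        if (c != 'Y') {
            return "unexpected character '" + c + "' at index " + i;
        }
        if (nCount == 0) {
            return "too few N's: found 0, at least 1 required";
        }
        if (i + 1 < input.length()) {
            return "unexpected trailing input at index " + (i + 1);
        }
        return "OK";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"NNY", "NNNNY", "XNNY"}) {
            System.out.println(s + ": " + explain(s));
        }
    }
}
```

For "NNNNY" this reports too many N's, and for "XNNY" it reports the unexpected 'X' at index 0. The obvious cost is that the checker has to be rewritten whenever the pattern changes.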


This may work, but I don't know whether it is what you need.

matches fails unless the whole sequence matches the pattern, but you can still use find to see whether some part of the sequence contains the pattern, and from that work out why the full match failed:

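import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static java.lang.System.out;

class F {
    public static void main(String... args) {
        if (args.length == 0) {
            out.println("Usage: java F <input>");
            return;
        }
        String input = args[0];
        String re = "N{1,3}Y";
        Pattern p = Pattern.compile(re);
        Matcher m = p.matcher(input);
        // matches() succeeds only if the whole input matches the pattern.
        out.printf("Evaluating: %s on %s%nMatched: %s%n", re, input, m.matches());
        // Look for partial matches, starting the search at each index in turn.
        for (int i = 0; i < input.length(); i++) {
            out.println();
            // find(i) resets the matcher and searches from index i onwards.
            boolean found = m.find(i);
            if (!found) {
                continue;
            }
            int s = m.start();
            int e = m.end();
            // Jump to the start of this match so the next search begins just after it.
            i = s;
            // Print the input with the matched part in brackets.
            out.printf("m.start[%s]%n"
                     + "m.end[%s]%n"
                     + "%s[%s]%s%n", s, e,
                     input.substring(0, s),
                     input.substring(s, e),
                     input.substring(e));
        }
    }
}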

Output:

C:\Users\oreyes\java\re>java F NNNNY
Evaluating: N{1,3}Y on NNNNY
Matched: false

m.start[1]
m.end[5]
N[NNNY]

m.start[2]
m.end[5]
NN[NNY]

m.start[3]
m.end[5]
NNN[NY]


C:\Users\oreyes\java\re>java F XNNY
Evaluating: N{1,3}Y on XNNY
Matched: false

m.start[1]
m.end[4]
X[NNY]

m.start[2]
m.end[4]
XN[NY]

In the first output, N[NNNY], you can tell there were too many N's; in the second, X[NNY], you can tell an X was present.

Here is another output:

C:\Users\oreyes\java\re>java F NYXNNXNNNNYX
Evaluating: N{1,3}Y on NYXNNXNNNNYX
Matched: false

m.start[0]
m.end[2]
[NY]XNNXNNNNYX

m.start[7]
m.end[11]
NYXNNXN[NNNY]X

m.start[8]
m.end[11]
NYXNNXNN[NNY]X

m.start[9]
m.end[11]
NYXNNXNNN[NY]X

The pattern is there but the whole expression didn't match.

It's a bit hard to understand from the docs how find, matches and lookingAt work (at least it was for me), but I hope this example helps you figure it out. Roughly:

matches is like /^YOURPATTERNHERE$/

lookingAt is like /^YOURPATTERNHERE/

find is like /YOURPATTERNHERE/
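A small sketch to see the three side by side on the pattern from the question. The class name MatchModes and the method modes are made up for this example; matches, lookingAt and find are the real Matcher methods.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the three Matcher methods on one input.
class MatchModes {

    /** Runs matches(), lookingAt() and find() on fresh matchers and reports the results. */
    static String modes(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(input).matches();      // entire input must match
        boolean prefix = p.matcher(input).lookingAt();   // match must start at index 0
        boolean anywhere = p.matcher(input).find();      // match may start anywhere
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"NNY", "NNYX", "XNNY", "XNN"}) {
            System.out.println(s + ": " + modes("N{1,3}Y", s));
        }
    }
}
```

With N{1,3}Y, "NNY" passes all three, "NNYX" passes lookingAt and find only, "XNNY" passes find only, and "XNN" passes none.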

I hope this helps.


What you are asking for would require the parser to determine a nearby string that actually matches your expression. This is a non-trivial problem that would probably run in exponential time (e.g. searching all possible strings of similar length to find a match).

So, in short, no.


For simple expressions like "N{1,3}Y", you can work out the cause yourself without tools. For more complicated expressions, my experience suggests:

  • Split bigger expressions into smaller ones, and test them independently.
  • Since you want fast feedback, use an interactive shell like Beanshell to test strings and patterns quickly, without a full compile step, public static void main (bla...) and so on. Or try Scala for this task. Sed is another powerful tool that uses regular expressions, but there are subtle differences in its syntax, which can introduce new errors.
  • Escaping is often a problem. Since every backslash in a Java string literal needs another backslash, it can be an advantage to read the expression from a JTextField, where you don't need so much escaping.
  • Write a small testing framework for your expressions, where you can easily plug in expressions and test strings, maybe generate test data automatically, and get visual feedback.
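The first tip can be sketched like this: apply the sub-expressions one after another and report the first one that does not match at the current position. StepwiseMatcher, diagnose and the part labels are names invented for this example. Each part is matched greedily with no backtracking into earlier parts, so this is an assumption that only holds for simple sequential patterns and is not equivalent to matching the combined expression in general.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: matches labelled sub-patterns in order and names the one that fails.
class StepwiseMatcher {

    /** parts maps a human-readable label to a sub-pattern; iteration order is the match order. */
    static String diagnose(Map<String, String> parts, String input) {
        int pos = 0;
        for (Map.Entry<String, String> part : parts.entrySet()) {
            Matcher m = Pattern.compile(part.getValue()).matcher(input);
            // Restrict the matcher to the unconsumed rest of the input.
            m.region(pos, input.length());
            // lookingAt() requires the match to begin at the region start.
            if (!m.lookingAt()) {
                String found = pos < input.length()
                        ? "'" + input.charAt(pos) + "'"
                        : "end of input";
                return "expected " + part.getKey() + " at index " + pos + ", found " + found;
            }
            pos = m.end();
        }
        return pos == input.length() ? "OK" : "unexpected trailing input at index " + pos;
    }

    public static void main(String[] args) {
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("1 to 3 N's", "N{1,3}");
        parts.put("a Y", "Y");
        for (String s : new String[] {"NNY", "NNNNY", "XNNY"}) {
            System.out.println(s + ": " + diagnose(parts, s));
        }
    }
}
```

For "NNNNY" the first part consumes three N's and the second part then fails at index 3 on an 'N', which points at the surplus N. For "XNNY" the first part fails straight away at index 0 on the 'X'.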