Matcher.find() returns false when it should be true
String s = "test";
Pattern pattern = Pattern.compile("\\n((\\w+\\s*[^\\n]){0,2})(\\b" + s + "\\b\\s)((\\w+\\s*){0,2})\\n?");
Matcher matcher = pattern.matcher(searchableText);
boolean topicTitleFound = matcher.find();
startIndex = 0;
while (topicTitleFound) {
    int i = searchableText.indexOf(matcher.group(0));
    if (i > startIndex) {
        builder.append(documentText.substring(startIndex, i - 1));
        ...
This is the text that I am processing:
Some text comes here
topicTitle test :
test1 : testing123 test2 : testing456 test3 : testing789 test4 : testing9097
When I test this regex on http://regexpal.com/ or http://www.regexplanet.com, it clearly finds the title line "topicTitle test". But in my Java code, topicTitleFound is false.
Please help
It could be that you have carriage-return characters ('\r') before the newline characters ('\n') in your searchableText. This would cause the match to fail at line boundaries.
To make your multi-line pattern more robust, try using the MULTILINE option when compiling the regex. Then use ^ and $ as needed to match line boundaries.
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
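To illustrate the MULTILINE suggestion, here is a minimal sketch. The class and method names (MultilineDemo, findTitleLine) are made up for this example, and the pattern is a simplified stand-in for the one in the question, not a drop-in replacement. It relies on the fact that, in MULTILINE mode, ^ and $ match around line terminators, including "\r\n".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDemo {
    // Returns the first whole line that contains `word` as a separate word,
    // or null when no line matches. Hypothetical helper for illustration.
    static String findTitleLine(String text, String word) {
        Pattern p = Pattern.compile(
                "^.*\\b" + Pattern.quote(word) + "\\b.*$",
                Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Windows-style line endings, to show that '\r' does not break ^ and $.
        String text = "Some text comes here\r\ntopicTitle test :\r\ntest1 : testing123";
        System.out.println(findTitleLine(text, "test")); // prints "topicTitle test :"
    }
}
```

Pattern.quote() is used so that a search word containing regex metacharacters is matched literally.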
Update:
After actually testing out your code, I see that the pattern matches whether carriage returns are present or not. In other words, your code "works" as-is, and topicTitleFound is true when it is first assigned (outside the while loop).
Are you sure that you are getting false for topicTitleFound? Or is the problem in the loop?
By the way, the use of indexOf() is wasteful and awkward, since the matcher already stores the index at which group 0 begins. Use this instead:
int i = matcher.start(0);
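As a rough sketch of how this fits into a loop, the example below collects the start and end offsets of every match by calling find() repeatedly. The class and method names (MatchOffsets, offsets) and the sample input are assumptions made for illustration; the elided body of the original loop is not reconstructed here. Note that indexOf(matcher.group(0)) always returns the first occurrence of the matched text, so it would report the wrong position for any later match with the same text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchOffsets {
    // Returns {start, end} for every match; end is exclusive.
    static List<int[]> offsets(Pattern pattern, String text) {
        List<int[]> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        // Each call to find() advances to the next match, so the loop ends
        // once no further match exists.
        while (matcher.find()) {
            result.add(new int[] { matcher.start(), matcher.end() });
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\btest\\b");
        for (int[] span : offsets(p, "a test and a test")) {
            System.out.println(span[0] + "-" + span[1]);
        }
        // prints:
        // 2-6
        // 13-17
    }
}
```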
Your regex is a bit hard to decrypt - not really obvious what you're trying to do. One thing that springs to mind is that your regex expects the match to start with a newline, and your sample text doesn't.
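The effect of the leading newline can be checked with a small sketch like the one below. The class and method names (LeadingNewlineDemo, matchesOriginal, matchesAnchored) are invented for this example, and the inputs are shortened versions of the sample text from the question. The anchored variant, which swaps the leading \n for ^ under MULTILINE, is one possible alternative and not necessarily what the asker intended.

```java
import java.util.regex.Pattern;

public class LeadingNewlineDemo {
    // Everything in the question's pattern after the leading "\\n".
    static final String BODY =
            "((\\w+\\s*[^\\n]){0,2})(\\b" + "test" + "\\b\\s)((\\w+\\s*){0,2})\\n?";

    // The pattern as posted: the match must begin with a newline character.
    static boolean matchesOriginal(String text) {
        return Pattern.compile("\\n" + BODY).matcher(text).find();
    }

    // Assumed alternative: anchor at the start of any line instead.
    static boolean matchesAnchored(String text) {
        return Pattern.compile("^" + BODY, Pattern.MULTILINE).matcher(text).find();
    }

    public static void main(String[] args) {
        String withPrecedingLine = "Some text comes here\ntopicTitle test :\ntest1 : testing123";
        String titleFirst = "topicTitle test :\ntest1 : testing123";

        System.out.println(matchesOriginal(withPrecedingLine)); // true
        System.out.println(matchesOriginal(titleFirst));        // false
        System.out.println(matchesAnchored(withPrecedingLine)); // true
        System.out.println(matchesAnchored(titleFirst));        // true
    }
}
```

If searchableText ever begins with the title line, or is built without the preceding line, the posted pattern cannot match, while the anchored variant still does.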