Find any literal with a Regular Expression

In my C# program, I have a regular expression text parser that finds all occurrences of words that are surrounded by double square brackets. For instance, [[anything]] would find the word anything.

In a second step, I want to count how often the found word (in my example: anything) appears in the whole text. To do this, I try to create a regex that contains the found word and count how many matches I get. The problem is that the found word can also contain special characters, and the following regex:

string foundWord = "(anything";
Regex countOccurences = new Regex(foundWord);

will obviously fail when the variable contains special chars like '('. Expresso suggests for matching whole expressions the following construct:

Regex countOccurences = new Regex("(?(" + foundWord + ")Yes|No)");

but when in this scenario foundWord is a number, like '2009', the RE tries to interpret it as a reference to a group (which is obviously not defined). In my text, there can be any combination of normal chars, special chars, numbers etc.

How can I tell the RE to interpret the given string as literal expression only?

Thanks in advance, Frank


You should escape the literal with Regex.Escape before building the regular expression from it.

Something like:

Regex countOccurances = new Regex(Regex.Escape(foundWord));
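To make the counting step concrete, here is a minimal sketch built on Regex.Escape and Regex.Matches. The class name LiteralCounter and the method name CountLiteral are made up for illustration and are not from the question; the empty-string guard is also an assumption about what you would want (an empty pattern would otherwise match at every position).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class LiteralCounter
{
    // Counts non-overlapping occurrences of foundWord in text, treating every
    // character of foundWord as a literal instead of as regex syntax.
    public static int CountLiteral(string text, string foundWord)
    {
        // Assumption: an empty or null search word counts as zero occurrences.
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(foundWord))
        {
            return 0;
        }

        // Regex.Escape turns "(anything" into "\(anything", so the
        // parenthesis no longer opens a group.
        string pattern = Regex.Escape(foundWord);
        return Regex.Matches(text, pattern).Count;
    }
}
```

Called as LiteralCounter.CountLiteral(wholeText, foundWord), this should behave the same for "(anything" and "2009", since neither is interpreted as a group or a group reference once escaped.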

However, since all you're doing is counting occurrences, a better option is to avoid using a regular expression for the second search at all. Since the found word is meant to be matched literally, special characters and all, a plain text search is simpler.


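If you're just trying to count the number of occurrences of a string, why use a regex at all? Just use the basic string methods, Contains(), IndexOf(), whatever makes most sense in C#. If you don't need the fancy functionality of a regex, there is no reason to use one. I think

// text is the whole input, foundString is the word to count (assumed non-empty).
int count = 0;
int position = text.IndexOf(foundString, StringComparison.Ordinal);
while (position != -1)
{
    count++;
    // Advancing by one character means overlapping occurrences are counted too.
    position = text.IndexOf(foundString, position + 1, StringComparison.Ordinal);
}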

would accomplish it without regexes.
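One detail worth knowing: a loop that advances by one character counts overlapping occurrences, while Regex.Matches returns non-overlapping matches, so the two approaches can disagree (for "aa" in "aaaa" they give 3 and 2). The sketch below wraps the loop in a helper that lets the caller choose. The names PlainTextCounter, CountOccurrences and allowOverlap are hypothetical, and ordinal (culture-insensitive, case-sensitive) comparison is an assumption about the desired matching.

```csharp
using System;

// Hypothetical helper, named for illustration only.
public static class PlainTextCounter
{
    // Counts occurrences of foundString in text with a plain ordinal search.
    // allowOverlap = false mirrors the non-overlapping behaviour of Regex.Matches.
    public static int CountOccurrences(string text, string foundString, bool allowOverlap = false)
    {
        // Assumption: empty or null input counts as zero occurrences.
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(foundString))
        {
            return 0;
        }

        // Step past the whole match, or by a single character when overlaps count.
        int step = allowOverlap ? 1 : foundString.Length;
        int count = 0;
        int position = text.IndexOf(foundString, StringComparison.Ordinal);
        while (position != -1)
        {
            count++;
            position = text.IndexOf(foundString, position + step, StringComparison.Ordinal);
        }
        return count;
    }
}
```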
