
Multifunction RegEx for parsing JCL variables - out of working solutions

I'm a bit lost creating a RegEx under C#.NET.

I'm writing something like a parser, so I use Regex.Replace to search the text for certain "variables" and replace them with their "values".

Each variable starts with an ampersand ("&") and ends with either another ampersand (the beginning of the next variable) or a dot.

Each variable (as well as text surrounding variables) can only consist of alphanumerical characters and certain "special" characters, that being "$", "@", "#" and "-".

Neither the variables nor the rest of the text can contain space characters (" ").

Now, the problem is that I'm trying to figure out a RegEx that replaces one possible ending character (".") while not replacing the other possible ending character ("&"). That turns out to be quite an issue:

  • "&"+variable+"[^A-Za-z0-9#@$]" does what I want, except that it also replaces the "&" - not acceptable.
  • "&"+variable+"(.)?\b" replaces the dot, but only if it is followed by a word character, not if it is followed by &, @, #, $ or -, which can occur, so this doesn't work either.
  • "&"+variable+"(.)?(?!A-Za-z0-9)" does exactly what I want for the ending characters, but it doesn't recognize the true end of a variable: a search-and-replace for "&DEN" also replaces that part of another variable called "&DENV", of which "&DEN" is a substring. This would create false/misleading results - totally unacceptable.
These were all the possibilities I could think of (and search for). Is it possible to do what I need with a single RegEx at all, using the C#.NET RegEx parser?

Just to illustrate desired function:

string variable="DEN";
string replaceWith="28";
string replText;
string regex = "<desired regex>";
replText = Regex.Replace(replText, "&"+variable+regex, replaceWith);

replText="&DEN";

=> replaced => repltext=="28"

replText="&DENV"    

=> not replaced => repltext=="&DENV"

replText="&DEN&DEN"    

=> replaced => repltext=="2828"

replText="&DEN&DENV"    

=> replaced, not replaced => repltext=="28&DENV"

replText="&DEN.anything"

=> replaced and dot removed => repltext=="28anything"

replText="&DEN..anything"

=> replaced and first dot removed => repltext=="28.anything"

A variable name could also look like "#DE@N-$".


The following works correctly on all of your examples. I assumed that a variable &FOO should only be replaced if it's followed by ., &, or end-of-string $. If it's followed by anything else, it's not replaced.

In order to match a terminating & without consuming it, I used a lookahead assertion (?=&). An assertion requires the text at that position to match, but it doesn't consume any characters, so those characters aren't replaced. A trailing . is still consumed by the match, however, so it is removed together with the variable.
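
To make the difference concrete, here is a minimal sketch comparing a pattern that consumes the terminating & with one that only looks ahead for it. The variable names consuming and lookahead are mine, and the snippet assumes C# 9 or later for top-level statements.

```csharp
// Requires C# 9 or later (top-level statements).
using System;
using System.Text.RegularExpressions;

// Consuming the terminator: the second '&' becomes part of the first match,
// so the second variable loses its leading '&' and is never matched.
string consuming = Regex.Replace("&DEN&DEN", @"&DEN(?:\.|&)", "28");

// Lookahead: the '&' is only checked, not consumed, so the second
// variable still starts with '&' and is replaced as well.
string lookahead = Regex.Replace("&DEN&DEN", @"&DEN(?:\.|(?=&)|$)", "28");

Console.WriteLine(consuming); // 28DEN
Console.WriteLine(lookahead); // 2828
```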

Finally, a MatchEvaluator is specified to use the captured pattern to do a lookup in the replacements dictionary for the replacement value. If the pattern (variable name) is not found, the text is effectively untouched (the full original capture is returned).

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Program
{
    static string ReplaceVariables(Dictionary<string, string> replacements, string input)
    {
        return Regex.Replace(input, @"&([\w\d$@#-]+)(\.|(?=&)|$)", m =>
        {
            string replacement = null;
            return replacements.TryGetValue(m.Groups[1].Value, out replacement)
                 ? replacement
                 : m.Groups[0].Value;
        });
    }

    static void Main(string[] args)
    {
        string[] tests = new[]
        {
            "&DEN", "&DENV", "&DEN&DEN",
            "&DEN&DENV", "&DEN.anything",
            "&DEN..anything", "&DEN Foo",
            "&DEN&FOO&DEN"
        };

        var replace = new Dictionary<string, string>
        {
            { "DEN", "28" },
            { "FOO", "42" }
        };

        foreach (var test in tests)
        {
            Console.WriteLine("{0} -> {1}", test, ReplaceVariables(replace, test));
        }
    }
}
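
If you would rather keep the one-variable-at-a-time shape from the question, the same idea can be sketched as below. ReplaceVariable is a hypothetical helper name of mine, not part of the code above. Since a name like "#DE@N-$" contains regex metacharacters, I'm assuming the name should go through Regex.Escape before it is concatenated into the pattern. Again this uses C# 9 top-level statements.

```csharp
// Requires C# 9 or later (top-level statements).
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces a single named variable in the text.
string ReplaceVariable(string text, string variable, string value)
{
    // Regex.Escape protects names such as "#DE@N-$", where '$' would
    // otherwise be read as an end-of-string anchor.
    string pattern = "&" + Regex.Escape(variable) + @"(?:\.|(?=&)|$)";

    // Using a MatchEvaluator inserts the value literally, so a '$' in the
    // value is not interpreted as a substitution such as $1.
    return Regex.Replace(text, pattern, m => value);
}

Console.WriteLine(ReplaceVariable("&DEN&DENV", "DEN", "28"));      // 28&DENV
Console.WriteLine(ReplaceVariable("&DEN..anything", "DEN", "28")); // 28.anything
```

The variable is only replaced when it is followed by a dot, by another & or by the end of the string, so "&DENV" is left alone when replacing "DEN".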


OK, I think I finally found it, using ORs. The regex
(.)?([^A-Za-z0-9#\@\$\&\,\;\:-\<>()\ ]|(?=\&)|\b)
seems to work fine. I'm posting it in case anyone finds it helpful.

EDIT: Sorry, I hadn't refreshed the page, so I posted this without knowing that Chris Schmich had already provided a better answer.
