Regex for "111 -> c:\my source\file1.cpp (no code)" (C#)
I need to parse a string with the following contents in C#:
111 -> c:\my source\file1.cpp (no code)
112 -> c:\my source\file1.cpp
113 -> c:\my source\file2.cpp
114 -> c:\my source\file3.cpp
115 -> c:\my source\file2.cpp (no code)
I need to get the first number and the file names, but only for records with code (so there should not be "(no code)" at the end). Currently I have ended up with this regex:
new Regex(@"^(\d+) -> ([^\r\n]*)", RegexOptions.Multiline | RegexOptions.IgnoreCase)
It is really simple, but it also returns the lines that I don't want to see.
All my attempts to write something like ^(\d+) -> ([^\r\n]*)(?! \(no code\))
failed.
Actually, this may be a more generic example. Like:
How do I match BBB in a string of the form "aaa BBB ccc", where BBB can be any set of characters, and aaa and ccc are known tokens that consist of the same set of characters as BBB?
Why can't you just use:
^(\d+) -> ([\w:\\\s.]+)$
With the multi-line option applied, it won't match the lines ending in (no code),
because parentheses are not in the last group's character class.
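As a rough sketch of how that pattern could be applied in C#, the snippet below runs it over the five records from the question. The class name, the `Parse` helper and the "number|path" output format are made up for illustration, and the sample is joined with "\n", which is an assumption about the real input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // The five records from the question, joined with '\n'.
    // Assumption: the real input may use "\r\n" instead.
    public static readonly string Input = string.Join("\n", new[]
    {
        @"111 -> c:\my source\file1.cpp (no code)",
        @"112 -> c:\my source\file1.cpp",
        @"113 -> c:\my source\file2.cpp",
        @"114 -> c:\my source\file3.cpp",
        @"115 -> c:\my source\file2.cpp (no code)",
    });

    // Hypothetical helper: returns "number|path" for every accepted record.
    public static List<string> Parse(string text)
    {
        // The character class has no '(' or ')', so a line ending in
        // " (no code)" can never reach the end-of-line anchor.
        var regex = new Regex(@"^(\d+) -> ([\w:\\\s.]+)$", RegexOptions.Multiline);
        var records = new List<string>();
        foreach (Match m in regex.Matches(text))
        {
            records.Add(m.Groups[1].Value + "|" + m.Groups[2].Value);
        }
        return records;
    }

    public static void Main()
    {
        // Prints the records numbered 112, 113 and 114.
        foreach (string record in Parse(Input))
        {
            Console.WriteLine(record);
        }
    }
}
```

One thing to watch for: in .NET, `$` with RegexOptions.Multiline matches before "\n" only. With "\r\n" line endings the captured path would likely end in a stray "\r" (the `\s` in the class accepts it), so trimming the value or normalizing line endings first may be needed.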
If you do need to permit parentheses in the file name, you can use something like:
^(\d+) -> (.+?)(?<! \(no code\))$
This uses a negative look-behind instead, so you can make sure that (no code) does not come immediately before the end of the line.
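To illustrate the difference, here is a minimal C# sketch of the look-behind version. Records 116 and 117 are hypothetical (they are not in the question) and are only there to show a file name that contains parentheses; the class and helper names are likewise invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookBehindDemo
{
    // Two records from the question plus two hypothetical ones (116, 117)
    // whose file name itself contains parentheses.
    public static readonly string Input = string.Join("\n", new[]
    {
        @"111 -> c:\my source\file1.cpp (no code)",
        @"112 -> c:\my source\file1.cpp",
        @"116 -> c:\my source\file (copy).cpp",
        @"117 -> c:\my source\file (copy).cpp (no code)",
    });

    // Hypothetical helper: returns "number|path" for every accepted record.
    public static List<string> Parse(string text)
    {
        // The look-behind is checked right before '$', so a line is rejected
        // only when the text immediately before its end is " (no code)".
        var regex = new Regex(@"^(\d+) -> (.+?)(?<! \(no code\))$", RegexOptions.Multiline);
        var records = new List<string>();
        foreach (Match m in regex.Matches(text))
        {
            records.Add(m.Groups[1].Value + "|" + m.Groups[2].Value);
        }
        return records;
    }

    public static void Main()
    {
        // Prints the records numbered 112 and 116.
        foreach (string record in Parse(Input))
        {
            Console.WriteLine(record);
        }
    }
}
```

This assumes "\n" line endings. With "\r\n", the "\r" would sit between "(no code)" and the position where `$` matches, so the look-behind would no longer see the marker; normalizing line endings before matching is one way to avoid that.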
I tested this with C# and it is working for me.
new Regex(@"^(\d+)\s->\s(.+\.\w+)(?!.*\(no code\))$", RegexOptions.Multiline | RegexOptions.IgnoreCase);
It's not that different from your attempt,
^(\d+) -> ([^\r\n]*)(?! \(no code\))
but I think your middle part ([^\r\n]*)
matches too much: it consumes the whole line, including (no code), so by the time the negative lookahead is checked there is nothing left for it to reject.
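A small sketch that shows the effect, assuming the five records from the question joined with "\n" (the class and helper names are invented for the example): the original attempt matches every line and captures the trailing (no code) as part of the path, while the anchored pattern keeps only the three records with code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // The five records from the question, joined with '\n' (an assumption).
    public static readonly string Input = string.Join("\n", new[]
    {
        @"111 -> c:\my source\file1.cpp (no code)",
        @"112 -> c:\my source\file1.cpp",
        @"113 -> c:\my source\file2.cpp",
        @"114 -> c:\my source\file3.cpp",
        @"115 -> c:\my source\file2.cpp (no code)",
    });

    // Pattern from the question: greedy middle part, then the lookahead.
    public const string Attempt = @"^(\d+) -> ([^\r\n]*)(?! \(no code\))";

    // Pattern from this answer: the path must end in ".ext" at end of line.
    public const string Anchored = @"^(\d+)\s->\s(.+\.\w+)(?!.*\(no code\))$";

    // Hypothetical helper: number of matches of a pattern in the text.
    public static int Count(string pattern, string text)
    {
        return new Regex(pattern, RegexOptions.Multiline).Matches(text).Count;
    }

    // Hypothetical helper: second capture group of the first match.
    public static string FirstPath(string pattern, string text)
    {
        return new Regex(pattern, RegexOptions.Multiline).Match(text).Groups[2].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Count(Attempt, Input));      // 5
        Console.WriteLine(FirstPath(Attempt, Input));  // c:\my source\file1.cpp (no code)
        Console.WriteLine(Count(Anchored, Input));     // 3
        Console.WriteLine(FirstPath(Anchored, Input)); // c:\my source\file1.cpp
    }
}
```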
Update:
I tested @Brad Christie's solution
new Regex(@"^(\d+) -> (.+?)(?<! \(no code\))$", RegexOptions.Multiline | RegexOptions.IgnoreCase);
and it also works with .NET/C# in my environment, so +1