
Understanding compiled regex in .NET

I have a regex that will be used repeatedly, where the stringLiteral varies from one invocation to the next.

One being:

.*(^stringLiteral Number 1\r?\n)([\w|\s][^\r\n]+)(.+)

and the next being:

.*(^stringLiteral Number 2\r?\n)([\w|\s][^\r\n]+)(.+)

Is there a chance for optimization here?

EDIT: To be a bit more explicit about the live data I'm working against - I'm parsing the body of an email that contains name/value pairs. I know the names (labels), and I know that the value I'm after is the line that follows the label. But I can't be sure that the name/value pairs (lines) will always fall in the same order, so I can't build one large expression.

I have to build multiple expressions that discard everything from the beginning of the block up to and including the given label (this would be the stringLiteral), capture the next line into a capture group, and then discard everything following that line.

So this line captures the Name field:

myOrder.Name = Regex.Replace(resultString, @".*(^Name\r\n)([\w|\s][^\r\n]+)(.+)", "$2", RegexOptions.Multiline | RegexOptions.Singleline);

and this line captures the Price field:

myOrder.Price = Regex.Replace(resultString, @".*(^Price\r\n)([\w|\s][^\r\n]+)(.+)", "$2", RegexOptions.Multiline | RegexOptions.Singleline);


Well, you could condense them into a single expression if you want to:

.*(^stringLiteral Number [12]\r?\n)([\w|\s][^\r\n]+)(.+)

If you post an example of the input you want to match or capture I could probably help some more.
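
For the Name/Price case in the question, the same idea might look something like the sketch below: one expression with the known labels in an alternation, scanned over the body once, so the order of the pairs does not matter. This is only an illustration under some assumptions: the class name LabelScan, the method Parse, and the sample body are made up for the example, and the value pattern is simplified to "the rest of the next line" ([^\r\n]+) rather than the exact character class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LabelScan
{
    // Group 1 is the label, group 2 is the line that follows it.
    // "Name" and "Price" are the labels from the question; extend the alternation as needed.
    static readonly Regex LabelThenValue = new Regex(
        @"^(Name|Price)\r?\n([^\r\n]+)",
        RegexOptions.Multiline);

    // Hypothetical helper: collects every label/value pair found in the body.
    public static Dictionary<string, string> Parse(string body)
    {
        var values = new Dictionary<string, string>();
        foreach (Match m in LabelThenValue.Matches(body))
        {
            values[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return values;
    }

    static void Main()
    {
        // Made-up sample input with the pairs in "wrong" order.
        string body = "Price\r\n19.99\r\nName\r\nJane Doe\r\n";
        Dictionary<string, string> values = Parse(body);
        Console.WriteLine(values["Name"]);   // Jane Doe
        Console.WriteLine(values["Price"]);  // 19.99
    }
}
```

Because each match consumes the label line and the value line together, a value line is not re-examined as a possible label. A label that is missing from the body is simply absent from the dictionary.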


You can condense them into a single expression as suggested by Andrew.

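You should also disable capturing where it's not needed, e.g. use the non-capturing group (?:subregexp) instead of (subregexp). Doing so saves memory, because the engine does not have to record the text matched by that group. (This does not disable backtracking; that is what the atomic group (?>subregexp) is for.)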
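
Since the title asks about compiled regexes, here is a rough sketch of how the per-label expressions could be built once and reused, with RegexOptions.Compiled and with only the value captured. The names LabelRegexCache, ForLabel and ValueAfter are invented for this example, and the value pattern is again simplified to [^\r\n]+. Whether Compiled pays off depends on how often each expression is reused, since it trades a higher construction cost for faster matching, so treat this as something to measure rather than a guaranteed win.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

static class LabelRegexCache
{
    // One Regex instance per label, constructed on first use and reused afterwards.
    static readonly ConcurrentDictionary<string, Regex> Cache =
        new ConcurrentDictionary<string, Regex>();

    static Regex ForLabel(string label)
    {
        // The label sits in a non-capturing group; only the following line is captured.
        // Regex.Escape protects labels that contain regex metacharacters.
        return Cache.GetOrAdd(label, l => new Regex(
            @"^(?:" + Regex.Escape(l) + @")\r?\n(?<value>[^\r\n]+)",
            RegexOptions.Multiline | RegexOptions.Compiled));
    }

    // Hypothetical helper: returns the line after the label, or an empty string if the label is absent.
    public static string ValueAfter(string body, string label)
    {
        Match m = ForLabel(label).Match(body);
        return m.Success ? m.Groups["value"].Value : string.Empty;
    }

    static void Main()
    {
        // Made-up sample input.
        string body = "Price\r\n19.99\r\nName\r\nJane Doe\r\n";
        Console.WriteLine(ValueAfter(body, "Name"));   // Jane Doe
        Console.WriteLine(ValueAfter(body, "Price"));  // 19.99
    }
}
```

Using Match and reading the group directly also avoids the leading .* and trailing (.+) that the Replace version needs in order to throw away the rest of the body.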
