Named capture Regex with two variants
I have been struggling with this all morning. Hoping some regex gurus can steer me in the right direction. Basically, I'm using regexes to compare two string values. The same regex is applied to both strings, and if the values in all named groups match, we consider the strings equivalent (this named group check is done in code).
For the strings, I have something like "jw-cst" which needs to be compared to "cst". The regex I need should consider these equivalent, as anything before and including the '-' should not be captured in the named group.
So...
jw-cst -> capture group value = "cst"
cst -> capture group value = "cst"
The name of the capture group is irrelevant; the app I'm working with simply loops through each captured group and makes sure the values match for both results.
So far I have this:
(?(?<=.-).|.*)
But it seems to be using the second alternative, so it always returns "jw-cst" instead of just "cst". If I remove the second alternative (.*), it will match correctly... Any help would be greatly appreciated.
You could use this regex:
^(?:\w+-)?(\w+)$
and apply it to both strings. Capture group 1 should then contain an identical string.
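As a rough illustration of applying that pattern to both strings, here is a minimal Python sketch. The question doesn't say which language is in use, so Python is an assumption, and the helper names captured_value and equivalent are made up for this example.

```python
import re

# Pattern from the answer: an optional, non-captured "prefix-" followed by
# the part that is captured in group 1.
PATTERN = re.compile(r"^(?:\w+-)?(\w+)$")


def captured_value(text):
    """Return capture group 1, or None if the pattern does not match.

    Hypothetical helper, not part of the original answer.
    """
    match = PATTERN.match(text)
    return match.group(1) if match else None


def equivalent(a, b):
    """Treat two strings as equivalent when both match and group 1 is equal."""
    value_a = captured_value(a)
    value_b = captured_value(b)
    return value_a is not None and value_a == value_b


print(captured_value("jw-cst"))
print(captured_value("cst"))
print(equivalent("jw-cst", "cst"))
```

Both calls to captured_value should give the same group 1 value, which is what the comparison relies on.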
This doesn't impose any restriction on string length, and \w allows any word character (letters, digits, and underscore); if you only want to allow 2 to 3 uppercase ASCII letters, for example, you could use
^(?:[A-Z]{2,3}-)?([A-Z]{2,3})$
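Since the question compares named groups, the stricter pattern can also be tried with a named group. The sketch below again assumes Python, where the syntax is (?P<name>...); the group name "code" and the helper names are hypothetical. Note that [A-Z] only matches uppercase letters, so lowercase input like "jw-cst" would need re.IGNORECASE or a different character class.

```python
import re

# Stricter variant of the answer's pattern, with the captured part given
# a name. "code" is a hypothetical group name chosen for this sketch.
STRICT = re.compile(r"^(?:[A-Z]{2,3}-)?(?P<code>[A-Z]{2,3})$")


def named_groups(text):
    """Return a dict of named group values, or None if there is no match."""
    match = STRICT.match(text)
    return match.groupdict() if match else None


def same_groups(a, b):
    """Compare every named group of both results, as the app in the question does."""
    groups_a = named_groups(a)
    groups_b = named_groups(b)
    return groups_a is not None and groups_a == groups_b


print(named_groups("JW-CST"))
print(named_groups("CST"))
print(same_groups("JW-CST", "CST"))
print(named_groups("jw-cst"))  # lowercase input does not match [A-Z]
```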
If you give information about which programming language (and therefore which regex engine) you're using, I might have more tips for you.
In .NET, you could also use:
(?<=^(?:\w+-)?)\w+$
That way, the entire match will only consist of the "second" part.