C# Regex Replace Pattern (Replace String) Return $1
I'm currently parsing some data from SQL Server and need help with a regex.
I have an assembly in SQL Server 2005 that lets me replace strings using the C# Regex.Replace() method.
I need to parse the following.
Strings:
CAD 90890
(CAD 90892)
CAD G67859
CAD 34G56
CAD 3S56.
AX CAD 890990
CAD 783783 MX
Needed Results:
90890
90892
G67859
34G56
3S56
890990
783783
SELECT TOP 25 CADCODE, dbo.RegExReplace(CADCODE, '*pattern*', '$1')
FROM dbo.CADCODES
WHERE CADCODE LIKE '%CAD%'
I need to get the string that follows the word CAD, up to the next white-space or any character that is not a letter or digit. I managed to get the all-digit codes, but my pattern fails on the others, and I can't find a real solution.
Thanks in advance.
Update: added these new strings:
AX CAD 890990
CAD 783783 MX
Try this:
(\w+)\W*$
The pattern matches the last word in the string, where a word is made up of alphanumeric characters (and underscores).
Example: http://www.rubular.com/r/1zWQQVLZy1
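To see what the last-word pattern does on the strings from the question, here is a minimal sketch. It uses Python's re module as a stand-in for the .NET Regex class (that substitution is an assumption on my part, and the helper name last_word is made up for illustration). It also shows that on the later-added string CAD 783783 MX the last word is MX, not the code.

```python
import re

# Pattern from the answer: capture the last run of word characters,
# allowing trailing non-word characters such as "." or ")".
LAST_WORD = re.compile(r"(\w+)\W*$")

def last_word(text):
    """Hypothetical helper: return the last word of text, or None."""
    match = LAST_WORD.search(text)
    return match.group(1) if match else None

samples = [
    "CAD 90890", "(CAD 90892)", "CAD G67859", "CAD 34G56",
    "CAD 3S56.", "AX CAD 890990", "CAD 783783 MX",
]

for sample in samples:
    print(f"{sample!r} -> {last_word(sample)!r}")

# The last line printed is 'CAD 783783 MX' -> 'MX', because MX is the last word.
```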
Another option is to find a word with at least one digit. This pattern can match anywhere in the string, so you may need to handle multiple matches. In this case, you can add a capturing group around the whole pattern, or replace using $&.
[a-zA-Z_]*\d\w*
Example: http://www.rubular.com/r/XUrFNuPQUv
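A rough sketch of the "word with at least one digit" idea, again in Python's re as a stand-in for .NET (an assumption; digit_words is a hypothetical helper name). Since the pattern has no capturing group, findall returns each whole match, which is what $& refers to in a .NET replacement string. The last call uses an invented input to show the multiple-match case.

```python
import re

# Pattern from the answer: optional leading letters/underscores,
# then at least one digit, then any remaining word characters.
DIGIT_WORD = re.compile(r"[a-zA-Z_]*\d\w*")

def digit_words(text):
    """Hypothetical helper: list every word in text that contains a digit."""
    return DIGIT_WORD.findall(text)

print(digit_words("CAD G67859"))     # ['G67859']
print(digit_words("CAD 783783 MX"))  # ['783783']

# Made-up input with two codes: both are returned, so the caller
# has to decide which match it wants.
print(digit_words("CAD 90890 or 90892"))  # ['90890', '90892']
```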
If you can't match (Regex.Match) and must use Regex.Replace, you can match the entire string start to end and replace it with the group you need:
RegExReplace(CADCODE, '^.*\b([a-zA-Z_]*\d\w*)\b.*$', '$1')
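Here is a small sketch of that whole-string replacement, checked against every string in the question. It is written with Python's re as a stand-in for the SQL CLR function (an assumption; extract_code is a hypothetical name), so the replacement is spelled \1 rather than $1. When the pattern does not match at all, the input comes back unchanged, which is worth keeping in mind for rows without a digit.

```python
import re

# Whole-string pattern from the answer: everything before and after the
# digit-containing word is consumed, so the replacement keeps only group 1.
WHOLE_STRING = re.compile(r"^.*\b([a-zA-Z_]*\d\w*)\b.*$")

def extract_code(text):
    """Hypothetical helper mirroring RegExReplace(text, pattern, '$1')."""
    return WHOLE_STRING.sub(r"\1", text)

samples = {
    "CAD 90890": "90890",
    "(CAD 90892)": "90892",
    "CAD G67859": "G67859",
    "CAD 34G56": "34G56",
    "CAD 3S56.": "3S56",
    "AX CAD 890990": "890990",
    "CAD 783783 MX": "783783",
}

for text, wanted in samples.items():
    print(f"{text!r} -> {extract_code(text)!r} (wanted {wanted!r})")
```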
I think this is what you're after:
^\W*\w*CAD\w*\W*(\w+)\W*$
The regex has to match the whole string so RegExReplace can replace it with $1, effectively stripping off the unwanted parts.
EDIT: Let me back up and make sure I've got this right. Because of the WHERE CADCODE LIKE '%CAD%' in your query, you already know every string contains the sequence CAD. That being the case, there's no need to complicate the regex by matching that sequence again. This should be all you need:
^.*?(\w+)\W*$
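To compare the two patterns from this answer, here is a hedged sketch in Python's re (a stand-in for .NET Regex, which is my assumption; replace_with_group is a hypothetical helper). Both work on the original strings, but neither handles both of the strings added in the update. The CAD-anchored pattern matches neither one, so they come back unchanged, and the shorter pattern returns the last word, which is MX for the final sample.

```python
import re

# First pattern from the answer: requires CAD, then captures the next word.
WITH_CAD = re.compile(r"^\W*\w*CAD\w*\W*(\w+)\W*$")
# Simplified pattern from the EDIT: captures the last word of the string.
LAST_WORD_ANCHORED = re.compile(r"^.*?(\w+)\W*$")

def replace_with_group(pattern, text):
    """Hypothetical helper mirroring RegExReplace(text, pattern, '$1')."""
    return pattern.sub(r"\1", text)

samples = [
    "CAD 90890", "(CAD 90892)", "CAD G67859", "CAD 34G56",
    "CAD 3S56.", "AX CAD 890990", "CAD 783783 MX",
]

for sample in samples:
    first = replace_with_group(WITH_CAD, sample)
    second = replace_with_group(LAST_WORD_ANCHORED, sample)
    print(f"{sample!r}: with CAD -> {first!r}, last word -> {second!r}")
```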
Try this:
(?:\(CAD\)|CAD)\s+?([\dA-Z]+)
You can get the result from capture group number 1.
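A minimal sketch of reading capture group 1, using Python's re in place of .NET's Regex.Match (an assumption; code_after_cad is a hypothetical helper). Because this pattern does not span the whole string, it suits a match-and-read-group approach. Used in a plain replace, it would leave the surrounding text in place, as the last line shows.

```python
import re

# Pattern from the answer: "(CAD)" or "CAD", whitespace, then a run of
# digits and upper-case letters captured as group 1.
AFTER_CAD = re.compile(r"(?:\(CAD\)|CAD)\s+?([\dA-Z]+)")

def code_after_cad(text):
    """Hypothetical helper: return the code following CAD, or None."""
    match = AFTER_CAD.search(text)
    return match.group(1) if match else None

print(code_after_cad("AX CAD 890990"))  # 890990
print(code_after_cad("CAD 783783 MX"))  # 783783

# Replacing only the matched part keeps everything outside the match.
print(AFTER_CAD.sub(r"\1", "(CAD 90892)"))  # (90892)
```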
The problem with regex is that it's always easy to get a good pattern if you have a limited sample set.
In your case, you use: \w{4}\w*
which just says: four word characters, followed by zero or more word characters. So none of the CAD sections would match, nor would spaces or parentheses.
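To illustrate the point about limited sample sets, here is a short sketch in Python's re (standing in for .NET Regex, which is an assumption; long_words is a hypothetical helper). The last two inputs are invented and do not come from the question. They show how a pattern that happens to fit the samples can break on a shorter code or a longer prefix.

```python
import re

# Pattern from the answer: any run of four or more word characters.
FOUR_PLUS = re.compile(r"\w{4}\w*")

def long_words(text):
    """Hypothetical helper: list every word of four or more characters."""
    return FOUR_PLUS.findall(text)

# On the question's data, CAD (three characters) is skipped and the code is found.
print(long_words("CAD 3S56."))      # ['3S56']
print(long_words("CAD 783783 MX"))  # ['783783']

# Invented inputs: a three-character code is missed entirely,
# and a four-letter prefix is picked up as if it were a code.
print(long_words("CAD 3S5"))        # []
print(long_words("CODE 12345"))     # ['CODE', '12345']
```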