Find and replace * in strings using regular expressions in C#
I have a large X12 EDI file with many description strings (thousands). These description strings can appear before, after, and between other strings that use the same delimiter, *.
All description strings start with the tag REF*TC** and end with the character ~.
I need to find and remove every * that occurs between these two tags without touching the other strings (in this example, the DTM string).
I am including an example of description strings as they would be found in the file. As you can see, the first description string contains the * characters that I need to replace; the second description string doesn't contain any * that needs to be replaced.
~REF*TC**BLAH*BLAH*~REF*TC**BLAHBLAH~REF*TC***BLAH~DTM*010*20110329~
desired output:
~REF*TC**BLAHBLAH~REF*TC**BLAHBLAH~REF*TC**BLAH~DTM*010*20110329~
I am using C#. This is what I have so far:
find expression: REF\*TC\*\*(.{0,}?)(\*+)(.{0,}?)(\**)(.{0,}?)(\**)~
Here's what I've come up with:
    var str = "~REF*TC**BLAH*BLAH*~REF*TC**BLAHBLAH~REF*TC***BLAH~DTM*010*20110329~";
    var result = (new Regex(@"(?<pre>REF\*TC\*\*)(?<text>.*?)(?<post>~)")).Replace(str, (m) =>
    {
        // Rebuild each match: keep the tag and the terminator, strip * from the body.
        return String.Join(String.Empty, new String[] {
            m.Groups["pre"].Value,
            m.Groups["text"].Value.Replace("*", String.Empty),
            m.Groups["post"].Value
        });
    });
That's just based on what you've provided; to be honest, I'm not 100% sure what you're going for.
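For anyone who wants to run the idea end to end, here is a minimal self-contained sketch of the same approach. The class and method names (RefTcRegexCleaner, Clean) are made up for illustration, and the body pattern uses [^~]* instead of .*? on the assumption that a description never contains ~ itself. Unlike ., that character class also matches across line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RefTcRegexCleaner
{
    // One description segment: start tag, body (anything up to the first ~), terminator.
    private static readonly Regex Segment =
        new Regex(@"(?<pre>REF\*TC\*\*)(?<text>[^~]*)(?<post>~)");

    // Removes every * from the body of each REF*TC** ... ~ segment.
    // Text outside those segments (e.g. the DTM string) is never matched, so it is left alone.
    public static string Clean(string edi)
    {
        return Segment.Replace(edi, m =>
            m.Groups["pre"].Value
            + m.Groups["text"].Value.Replace("*", string.Empty)
            + m.Groups["post"].Value);
    }

    public static void Main()
    {
        var input = "~REF*TC**BLAH*BLAH*~REF*TC**BLAHBLAH~REF*TC***BLAH~DTM*010*20110329~";
        // Prints: ~REF*TC**BLAHBLAH~REF*TC**BLAHBLAH~REF*TC**BLAH~DTM*010*20110329~
        Console.WriteLine(Clean(input));
    }
}
```

Keeping the Regex in a static field means the pattern is only parsed once, which may matter if the file is processed in many chunks.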
Regex is awesome, but as the famous quote goes: "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
Skip the regex and just use string methods. You could go as simple as splitting on the REF*TC** start tags and then removing the * characters in each piece up to its closing ~, or you could try for something more sophisticated. Don't go all the way to regex when simple string methods will do.
EDIT:
Here's a real simple example:
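    // Split on the start tag; lines[0] is whatever precedes the first tag, so skip it.
    string[] lines = file.Split(new[] { "REF*TC**" }, StringSplitOptions.None);
    for (int i = 1; i < lines.Length; i++)
    {
        // Strip * only up to the closing ~ so later segments (e.g. DTM) are untouched.
        int end = lines[i].IndexOf('~');
        if (end < 0) end = lines[i].Length;
        lines[i] = lines[i].Substring(0, end).Replace("*", "") + lines[i].Substring(end);
    }
    string output = string.Join("REF*TC**", lines);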
Join() only puts the separator between elements, so there is no extra "REF*TC**" to clean up at the end. Anyway, that should do it.
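As one possible take on the "something more sophisticated" option, here is a rough single-pass sketch that uses only IndexOf and a StringBuilder. The names RefTcScanner and StripStars are hypothetical. It assumes each description ends at the first ~ after its tag, and it treats a description with no closing ~ as running to the end of the input, which may or may not be what you want for malformed files.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any library.
public static class RefTcScanner
{
    private const string StartTag = "REF*TC**";
    private const char EndChar = '~';

    // Copies the input through unchanged, except that * is dropped
    // between each REF*TC** tag and the next ~.
    public static string StripStars(string edi)
    {
        var sb = new StringBuilder(edi.Length);
        int pos = 0;
        while (true)
        {
            int tag = edi.IndexOf(StartTag, pos, StringComparison.Ordinal);
            if (tag < 0) break;

            int bodyStart = tag + StartTag.Length;
            int bodyEnd = edi.IndexOf(EndChar, bodyStart);
            if (bodyEnd < 0) bodyEnd = edi.Length; // assumption: unterminated body runs to the end

            // Everything up to and including the tag is copied as-is.
            sb.Append(edi, pos, bodyStart - pos);

            // Copy the body without its * characters.
            for (int i = bodyStart; i < bodyEnd; i++)
            {
                if (edi[i] != '*') sb.Append(edi[i]);
            }
            pos = bodyEnd;
        }
        // Whatever follows the last description (e.g. the DTM string) is copied as-is.
        sb.Append(edi, pos, edi.Length - pos);
        return sb.ToString();
    }

    public static void Main()
    {
        var input = "~REF*TC**BLAH*BLAH*~REF*TC**BLAHBLAH~REF*TC***BLAH~DTM*010*20110329~";
        // Prints: ~REF*TC**BLAHBLAH~REF*TC**BLAHBLAH~REF*TC**BLAH~DTM*010*20110329~
        Console.WriteLine(StripStars(input));
    }
}
```

This avoids building an intermediate array of pieces, which could help on a very large file, though for a few thousand descriptions either version should be fine.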