C# Regex remove line

I need to apply a regex in C#. The string looks like the following:

MSH|^~\&|OAZIS||C2M||20110310222404||ADT^A08|00226682|P|2.3||||||ASCII
EVN|A08
PD1
PV1|1|test

And what I want to do is delete all the lines that only contain 3 characters (with no delimiters '|'). So in this case, the 'PD1' line (3rd line) has to be deleted. Is this possible with a regex?

Thx


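The following will do what you want without regular expressions.

// Read the whole file into one string (needs using System.IO).
string inputString = File.ReadAllText("fileLocation");
string resultingString = "";
foreach (var line in inputString.Split('\n')) {
    // Keep lines that are longer than 3 characters or contain a '|' delimiter.
    if (line.Trim().Length > 3 || line.Contains("|"))
        resultingString += line + "\n";
}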

This assumes that you have your file as one large string. And it gives you another string with the necessary lines removed.

(Or you could do it with the file directly; this needs using System.IO and using System.Linq:

string[] goodLines =
    // read the lines of the file one at a time
    File.ReadLines("fileLocation")
        // keep only the ones you want
        .Where(line => line.Trim().Length > 3 || line.Contains("|"))
        .ToArray();

You end up with a string[] containing all of the correct lines in your file.)


This regex: (?<![|])[^\n]{4}\n matched what you wanted in the online regex tester I used. However, I believe the {4} should actually be {3}, so try switching it if it doesn't work for you.

EDIT:

This also works: \n[^|\n]{3}\n and is probably closer to what you are looking for.

EDIT 2:

The number in braces is definitely {3}; I tested it at home.


Why not just open the file, create a temporary output file, and run through the lines one by one? If a line has exactly 3 characters, skip it. If the file fits entirely in memory, you could instead use File.ReadAllLines() to get an array of strings that represents the file line by line.
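The line-by-line approach can be sketched roughly as follows. This is only an illustration, not the answerer's code: CopyFiltered is a hypothetical helper name, the StringReader and StringWriter stand in for the real input file and temporary output file, and the "exactly 3 characters and no '|'" test is an assumed reading of the question. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.IO;

// Hypothetical helper: copies lines from reader to writer, skipping lines that
// are exactly three characters long (after trimming) and contain no '|'.
static void CopyFiltered(TextReader reader, TextWriter writer)
{
    string? line;
    while ((line = reader.ReadLine()) != null)
    {
        bool isBareSegment = line.Trim().Length == 3 && !line.Contains("|");
        if (!isBareSegment)
            writer.WriteLine(line);
    }
}

// In-memory stand-ins for the input file and the temporary output file.
using var input = new StringReader("EVN|A08\nPD1\nPV1|1|test");
using var output = new StringWriter();
CopyFiltered(input, output);
Console.Write(output.ToString());
```

With real files, the reader and writer would presumably come from File.OpenText and File.CreateText, and the temporary file would replace the original once the copy has finished.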


Are the three characters always going to be by themselves on a line? If so, you can use beginning of string/end of string markers.

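Here's a regex that matches a string consisting of exactly three characters:

\A.{3}\z

\A is the start of the string, \z is the end of the string, . is any character except a newline, and {3} means exactly 3 occurrences. Because \A and \z anchor to the whole string, this has to be applied to each line separately.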


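^ is the start of the line, \w is a word character, {3} means repeated exactly 3 times, and $ is the end of the line. In .NET, ^ and $ only match at line boundaries when RegexOptions.Multiline is set.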

^\w{3}$


Just a general observation from the solutions I've seen posted so far. The original question included the comment "delete all the lines that only contain 3 characters" [my emphasis]. I'm not sure if you meant literally "only 3 characters", but in case you did, you may want to change the logic of the proposed solutions from things like

   if (line.Trim().Length > 3 ...)

to

   if (line.Trim().Length != 3 ...)

...just in case lines with 2 characters are indeed valid, for example. (Same idea for the proposed regex solutions.)
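To make the difference concrete, here is a small sketch of the != 3 variant. KeepLines is a hypothetical name and the two-character line "AB" is made-up sample data added only to show the effect; it uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical helper: keeps a line unless it is exactly three characters long
// (after trimming) and has no '|' delimiter. Shorter lines such as "AB" stay.
static string[] KeepLines(string text) =>
    text.Split('\n')
        .Where(line => line.Trim().Length != 3 || line.Contains("|"))
        .ToArray();

string[] kept = KeepLines("EVN|A08\nPD1\nAB\nPV1|1|test");
Console.WriteLine(string.Join("\n", kept));
```

With the > 3 test, the "AB" line would be dropped along with "PD1"; with != 3, only "PD1" is removed.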


This regex identifies the lines that meet your exclusion criteria: ^[^|]{3}$. Then it's just a matter of iterating over all the lines (in data) and checking which ones meet the exclusion criteria. Like this, for instance:

// RegexOptions.Multiline makes ^ and $ match at each line; assumes "\n" line endings.
foreach (Match match in Regex.Matches(data, @"^.+$", RegexOptions.Multiline))
{
  if (!Regex.IsMatch(match.Value, @"^[^|]{3}$"))
  {
     // Do something with the legitimate match.Value, such as writing the line to the target file.
  }
}


The question is a little vague.

As stated, the answer is something like this

(?:^|(?<=\n))[^\n|]{3}(?:\n|$) which allows whitespace in the match.
So "#\t)" will also be deleted.

To limit the characters to visible (non-whitespace) ones, you could use
(?:^|(?<=\n))[^\s|]{3}(?:\n|$)
which doesn't allow whitespace.

For both, the input is a single string, the replacement is '', and the substitution is global.
Example in Perl: s/(?:^|(?<=\n))[^\n|]{3}(?:\n|$)//g


Try this:

text = System.Text.RegularExpressions.Regex.Replace(
        text, 
        @"^[^|]{3}(?:\r\n|[\r\n]|$)", 
        "", 
        System.Text.RegularExpressions.RegexOptions.Multiline);
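
As a rough, self-contained sketch of this kind of replacement applied to the sample message from the question: RemoveThreeCharLines is a hypothetical helper name, and the character class here is [^|\r\n] rather than [^|], a small variation of my own so that the three characters cannot themselves be line breaks. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: removes every line that consists of exactly three
// characters other than '|', together with its line terminator.
static string RemoveThreeCharLines(string text) =>
    Regex.Replace(text, @"^[^|\r\n]{3}(?:\r\n|\r|\n|$)", "", RegexOptions.Multiline);

string message =
    "MSH|^~\\&|OAZIS||C2M||20110310222404||ADT^A08|00226682|P|2.3||||||ASCII\n" +
    "EVN|A08\n" +
    "PD1\n" +
    "PV1|1|test";

Console.WriteLine(RemoveThreeCharLines(message));
```

Because the line terminator is part of the match, the 'PD1' line disappears completely instead of leaving an empty line behind.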


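You can do it using Regex:

string output = Regex.Replace(input, "^[a-zA-Z0-9]{3}$", "", RegexOptions.Multiline);

[a-zA-Z0-9] matches any letter or digit, and {3} matches exactly 3 of them. RegexOptions.Multiline is needed so that ^ and $ match at the start and end of each line rather than of the whole string. Note that this empties the line but leaves its line break in place.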
