
How does the following regex pattern work?

I'm horrible with regex, but I'm trying to figure out how an import function works and I came across this regex pattern. Maybe one of you can help me understand how it works.

string pattern = @"^""(?<code>.*)"",""(?<last_name>.*)"",""(?<first_name>.*)"",""(?<address>.*)"",""(?<city>.*)"",""(?<state>.*)"",""(?<zip>.*)""$";
Regex re = new Regex(pattern);
Match ma = re.Match(_sReader.ReadLine().Trim());

Thanks


It looks like it's trying to split a comma-delimited string (with the fields having quotes around them) into separate fields with named groups. The (?<name>...) syntax captures each field into a named group. The ^ means the match has to begin at the start of the string, and the $ is the end-of-string anchor. The .* in each group captures everything (any character except a newline, zero or more times) that occurs between the double quotes. Because the pattern is a verbatim string (@"..."), each "" in it stands for one literal double quote.

Basically, it parses one line of the input CSV into a set of captured strings that you can refer to by group name. You can reference the captured groups using ma.Groups[x], where x is an integer index, or you can use the group name, for example ma.Groups["code"].
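To make the group access concrete, here is a minimal sketch that runs the pattern from the question against a made-up record. The sample line, the class name NamedGroupDemo and the Parse helper are assumptions for illustration only; they are not part of the original import code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Same pattern as in the question; inside @"..." each "" is one literal quote.
    public const string Pattern =
        @"^""(?<code>.*)"",""(?<last_name>.*)"",""(?<first_name>.*)"",""(?<address>.*)"",""(?<city>.*)"",""(?<state>.*)"",""(?<zip>.*)""$";

    // Hypothetical helper: trims the line and applies the pattern once.
    public static Match Parse(string line)
    {
        return new Regex(Pattern).Match(line.Trim());
    }

    public static void Main()
    {
        // Made-up sample record with seven quoted fields.
        string line = "\"A100\",\"Bond\",\"James\",\"1 Main St\",\"London\",\"NA\",\"00700\"";
        Match ma = Parse(line);

        if (ma.Success)
        {
            Console.WriteLine(ma.Groups["code"].Value);    // A100
            Console.WriteLine(ma.Groups["address"].Value); // 1 Main St
            // Group 0 is the whole match, so the 4th named group has index 4.
            Console.WriteLine(ma.Groups[4].Value);         // 1 Main St
        }
        else
        {
            Console.WriteLine("Line did not match the expected layout.");
        }
    }
}
```

Checking ma.Success first matters: when the line does not match, the groups are simply empty and no exception is thrown.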


The way I read it, it's a flat-file record parser.
In this case, it's a CSV with quoted fields.

And it gives you a dictionary of the fields, so that you can work with the CSV record more easily.

Instead of having to know that the 4th field is the address in the code after this, you simply reference Groups["address"] and get the 4th field.
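If you want an actual dictionary rather than the Groups collection, something like the following sketch would do it. The ToDictionary helper and the shortened two-field pattern are assumptions made up for this example; Regex.GetGroupNames() is the standard .NET call that lists the group names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RecordToDictionary
{
    // Hypothetical helper: copies every named group of a match into a dictionary.
    public static Dictionary<string, string> ToDictionary(Regex re, Match m)
    {
        var fields = new Dictionary<string, string>();
        foreach (string name in re.GetGroupNames())
        {
            if (name == "0") continue; // group "0" is the whole match, not a field
            fields[name] = m.Groups[name].Value;
        }
        return fields;
    }

    public static void Main()
    {
        // Shortened two-field version of the pattern, for illustration only.
        var re = new Regex(@"^""(?<last_name>.*)"",""(?<first_name>.*)""$");
        Match m = re.Match("\"Bond\",\"James\"");

        if (m.Success)
        {
            Dictionary<string, string> fields = ToDictionary(re, m);
            Console.WriteLine(fields["last_name"]);  // Bond
            Console.WriteLine(fields["first_name"]); // James
        }
    }
}
```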

There are more straightforward and generic ways to do this. This regex is going to be very fragile over time, for example if the file is poorly delimited or if a quote is missing at the end of the file.
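One example of that fragility, as a sketch: because .* is greedy and also matches quotes and commas, a line with an extra field still matches and silently merges two fields. Swapping .* for [^"]* is one possible tightening (an assumption of this example, not something the original code does), and it still does not handle quotes escaped inside a field. The BuildPattern helper and the sample line are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictCsvDemo
{
    static readonly string[] Fields =
        { "code", "last_name", "first_name", "address", "city", "state", "zip" };

    // Hypothetical helper: builds ^"(?<code>BODY)","(?<last_name>BODY)",...$
    // BuildPattern(".*") reproduces the pattern from the question.
    public static string BuildPattern(string fieldBody)
    {
        string[] parts = Array.ConvertAll(
            Fields, f => "\"(?<" + f + ">" + fieldBody + ")\"");
        return "^" + string.Join(",", parts) + "$";
    }

    public static void Main()
    {
        // Made-up malformed record: eight fields instead of seven.
        string line = "\"A100\",\"Bond\",\"James\",\"1 Main St\",\"London\",\"NA\",\"00700\",\"extra\"";

        Match loose = Regex.Match(line, BuildPattern(".*"));
        Match strict = Regex.Match(line, BuildPattern("[^\"]*"));

        Console.WriteLine(loose.Success);               // True
        Console.WriteLine(loose.Groups["code"].Value);  // A100","Bond
        Console.WriteLine(strict.Success);              // False
    }
}
```

With the stricter field body the malformed line is rejected outright, which is usually easier to debug than a record whose fields have shifted by one.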


Divide and conquer works best with regex: try a shortened version of the pattern first and look at what each group captures.

    string pattern = @"^""(?<last_name>.*)"",""(?<first_name>.*)""";

    Regex re = new Regex(pattern);

    //  INPUT: type the line including the double quotes, for example:
    //  "("bond","james")"
    Match mm = re.Match(Console.ReadLine().Trim());

    Console.WriteLine("Last Name:"+mm.Groups["last_name"]);
    Console.WriteLine("First Name:"+mm.Groups["first_name"]);

OUTPUT:

 Last Name:("bond
 First Name:james")