Extracting data from text file with differing delimiters

I have a text file that I need to split into an array, where each element contains the data for one person. I will then use Regex (C#) to extract all the data for that person. The problem I am having is matching the start of each person, because the pattern changes within the file.

A simplified version of the data is below:

Address FirstName \r\nSurname NHS No Age = 44\r\n

Address FirstName\r\n Surname NHS No 12345\r\n

Address FirstName\r\n Surname NHS No Age = 35\r\n

Address FirstName \r\nSurname NHS No 54321\r\n

As you can see, there are line breaks within each record, so the StreamReader.ReadLine() method probably won't work on its own. The address, first name and surname fields are fixed-length fields, and I can extract these using Substring. I can split the file into the array of people once I have a consistent marker for the start/end of each person.

I need to use Regex.Replace to add a start of person marker, then use this marker to split into the array. I would appreciate any help with this.


Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems. (Jamie Zawinski)

Are you convinced that regex will make your code easier to write, read and maintain?

Consider using String.Split() instead.
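As a rough sketch of the String.Split() route: in the sample above, every person occupies exactly two physical lines, and gluing the two halves back together gives the same fixed-width layout in both variants. The code below assumes that holds for the whole file, which may not be true of the real data. `PersonSplitter` and `SplitIntoPeople` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

static class PersonSplitter
{
    // Hypothetical helper: joins each pair of physical lines into one record.
    // Assumption: every person spans exactly two lines, as in the sample data.
    public static List<string> SplitIntoPeople(string text)
    {
        // Split on the Windows line break and ignore empty entries
        // (for example, the empty string after the final "\r\n").
        string[] lines = text.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

        var people = new List<string>();
        for (int i = 0; i + 1 < lines.Length; i += 2)
        {
            // Concatenate the two halves without adding a separator, so the
            // fixed-length fields keep their original offsets.
            people.Add(lines[i] + lines[i + 1]);
        }
        return people;
    }
}
```

Each element of the returned list is then one person, and the fixed-length fields can be read from it with Substring as described in the question. If the file ends with an unpaired line, this sketch silently ignores it.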


From your comments, it looks like each row represents a single entity, regardless of the nuances of the format. To start, you could read the file line by line and split each line into words using String.Split:

using System;
using System.IO;

using (StreamReader sr = new StreamReader("addresses.txt"))
{
    string line;
    // Read and display lines from the file until the end of
    // the file is reached.
    while ((line = sr.ReadLine()) != null)
    {
        string[] tokens = line.Split(' ');

        // variant 1: Address FirstName Surname NHS No //Person1 Age = 44
        // variant 2: Address FirstName Surname NHS No //person 2 12345

        // Skip lines that do not have at least an address and a first name.
        if (tokens.Length < 2)
            continue;

        // The {0} placeholder is required; without it the token is not printed.
        Console.WriteLine("Address: {0}", tokens[0]);
        Console.WriteLine("First name: {0}", tokens[1]);

        // etc.
    }
}
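If a record really does contain an embedded line break, as the sample in the question suggests, reading line by line will cut each person in two. A minimal sketch of one way around that is below. In the sample, the line break that ends a person always follows a digit ("44", "12345"), while the break inside a person follows the first name or a space. The code assumes this is true of the whole file, and in particular that the first half of a record never ends in a digit. `RecordSplitter` and `SplitRecords` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RecordSplitter
{
    // Hypothetical helper: splits the raw file text into one string per person.
    // Assumption: only the line break that ends a person is preceded by a digit.
    public static string[] SplitRecords(string text)
    {
        // (?<=\d) is a look-behind, so "\r\n" matches only when a digit comes
        // immediately before it; the digit itself stays in the record.
        return Regex.Split(text, @"(?<=\d)\r\n")
            .Where(part => part.Length > 0)          // drop the empty tail after the last break
            .Select(part => part.Replace("\r\n", "")) // remove the break inside the record
            .ToArray();
    }
}
```

The same pattern could be passed to Regex.Replace to insert an explicit marker instead, which is closer to what the question describes; splitting directly just avoids having to pick a marker that cannot occur in the data.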