How to parse this text in C#

abc  = tamaz feeo maa roo key gaera porla
Xyz = gippaza eka jaguar ammaz te sanna.

I want to make a struct:

public struct word
{
    public string Word;
    public string Definition;
}

How can I parse the text and build a list of <word> in C#?

Thanks for the help, but the input is free text, and I cannot be sure whether a definition takes one line or several. How should I handle newlines?


Read the input line by line and split each line at the equals sign.

class Entry
{
    private string term;
    private string definition;

    // The constructor must be public so that code outside the class can call it
    public Entry(string term, string definition)
    {
        this.term = term;
        this.definition = definition;
    }
}

// ...

string[] data = line.Split('=');
string word = data[0].Trim();
string definition = data[1].Trim();

Entry entry = new Entry(word, definition);


This can also be done using a very simple LINQ query:

var definitions =
    from line in File.ReadAllLines(file)
    let parts = line.Split('=')
    select new word
        {
            Word = parts[0].Trim(),
            Definition = parts[1].Trim()
        };
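
The query above assumes every line contains an equals sign; a blank line or a line without one makes parts[1] throw. A minimal sketch of a more tolerant variant follows. The class name LinqParser and the method name Parse are hypothetical, chosen only for illustration, and skipping lines without '=' is an assumption about what the input should allow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct word
{
    public string Word;
    public string Definition;
}

public static class LinqParser
{
    // Hypothetical helper: turns "word = definition" lines into a List<word>.
    public static List<word> Parse(IEnumerable<string> lines)
    {
        var query =
            from line in lines
            where line.Contains("=")                  // skip blank or malformed lines
            let parts = line.Split(new[] { '=' }, 2)  // split at the first '=' only
            select new word
            {
                Word = parts[0].Trim(),
                Definition = parts[1].Trim()
            };

        return query.ToList();
    }
}
```

It could be called as LinqParser.Parse(File.ReadAllLines(file)), which needs using System.IO. Limiting the split to two parts keeps any later equals signs inside the definition.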


Using a regular expression, you can proceed in two ways, depending on the form of your input.


Example 1

Assuming you have read your source and stored each line in an array or list:

string[] input = { "abc  = tamaz feeo maa roo key gaera porla", "Xyz = gippaza eka jaguar ammaz te sanna." };

Regex mySplit = new Regex("(\\w+)\\s*=\\s*((\\w+).*)");

List<word> mylist = new List<word>();

foreach (string wordDef in input)
{
    Match myMatch = mySplit.Match(wordDef);

    // Skip lines that do not have the form "word = definition"
    if (!myMatch.Success)
        continue;

    word myWord;

    myWord.Word = myMatch.Groups[1].Value;
    myWord.Definition = myMatch.Groups[2].Value;

    mylist.Add(myWord);
}

Example 2

Assuming you have read your source into a single string (with each line terminated by the line break character '\n'), you can use the same regexp "(\w+)\s*=\s*((\w+).*)", but in this way:

string inputs = "abc  = tamaz feeo maa roo, key gaera porla\r\nXyz = gippaza eka jaguar; ammaz: te sanna.";

MatchCollection myMatches = mySplit.Matches(inputs);

foreach (Match singleMatch in myMatches)
{
    word myWord;

    myWord.Word = singleMatch.Groups[1].Value;
    // '.' stops at '\n' but still matches '\r', so trim the trailing '\r' of Windows line endings
    myWord.Definition = singleMatch.Groups[2].Value.Trim();

    mylist.Add(myWord);
}

Lines that match or do not match the regexp "(\w+)\s*=\s*((\w+).*)":

  • "abc = tamaz feeo maa roo key gaera porla,qsdsdsqdqsd\n" --> Match!
  • "Xyz= gippaza eka jaguar ammaz te sanna. sdq=sqds \n" --> Match! The description can include spaces, too.
  • "qsdqsd=\nsdsdsd\n" --> Match! A pair split across two lines matches as well.
  • "sdqsd=\n" --> Does NOT match (the description is missing).
  • "= sdq sqdqsd.\n" --> Does NOT match (the word is missing).
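
The list above can be checked with a few lines of code. This is only a sketch: the class name RegexCheck and its members are hypothetical, and the pattern is the one quoted in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCheck
{
    // Same pattern as above, written as a verbatim string to avoid double backslashes.
    public static readonly Regex Pattern = new Regex(@"(\w+)\s*=\s*((\w+).*)");

    // True when the input contains a "word = definition" pair.
    public static bool Matches(string input)
    {
        return Pattern.IsMatch(input);
    }

    // Returns the captured definition (group 2), or null when nothing matches.
    public static string Definition(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups[2].Value : null;
    }
}
```

For example, RegexCheck.Matches("sdqsd=\n") is false because \w+ finds nothing after the equals sign, while RegexCheck.Definition("qsdqsd=\nsdsdsd\n") yields "sdsdsd" because \s* is allowed to cross the line break.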


// Split at an = sign. Take at most two parts (word and definition); 
//    ignore any = signs in the definition
string[] parts = line.Split(new[] { '=' }, 2);

word w = new word();
w.Word = parts[0].Trim();

// If the definition is missing then parts.Length == 1
if (parts.Length == 1)
    w.Definition = string.Empty;
else
    w.Definition = parts[1].Trim();

words.Add(w);
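
For the follow-up about definitions that run over several lines, one possible approach is sketched below. It rests on an assumption not stated in the question: a line containing '=' starts a new entry, and any other non-blank line continues the previous definition. The names MultiLineParser and Parse are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public struct word
{
    public string Word;
    public string Definition;
}

public static class MultiLineParser
{
    // Assumption: a line with '=' begins a new entry; other non-blank lines
    // are appended to the definition of the entry before them.
    public static List<word> Parse(string text)
    {
        var words = new List<word>();
        word current = new word();
        bool hasCurrent = false;

        foreach (string rawLine in text.Split('\n'))
        {
            string line = rawLine.Trim();   // also removes a trailing '\r'
            if (line.Length == 0)
                continue;

            if (line.Contains("="))
            {
                if (hasCurrent)
                    words.Add(current);     // the previous entry is complete

                string[] parts = line.Split(new[] { '=' }, 2);
                current = new word { Word = parts[0].Trim(), Definition = parts[1].Trim() };
                hasCurrent = true;
            }
            else if (hasCurrent)
            {
                // Continuation line: join it to the current definition with a space
                current.Definition = (current.Definition + " " + line).Trim();
            }
        }

        if (hasCurrent)
            words.Add(current);

        return words;
    }
}
```

It could be fed the whole file at once, for instance with File.ReadAllText. A continuation line that itself contains '=' would be taken as a new entry, so this rule only fits input where that cannot happen.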


Use Regular Expressions
