开发者

View all text of an element with XmlReader C#

I'm using an XmlReader to iterate through some XML. Some of the XML is actually HTML and I want to get the text content from the node.

Example XML:

<?xml version="1.0" encoding="UTF-8"?>
<data>
  <p>Here is some <b>data</b></p>
</data>

Example code:

using (XmlReader reader = new XmlReader(myUrl))
{
  while (reader.Read()) 
  {
    if (reader.Name == "p")
    { 
      // I want to get all the TEXT contents from the this n开发者_运维问答ode
      myVar = reader.Value;
    }
  }
}

This doesn't get me all the contents. How do I get all the contents from the

node in that situation?


Use ReadInnerXml:

        StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
<data>
  <p>Here is some <b>data</b></p>
</data>");
        using (XmlReader reader = XmlReader.Create(myUrl))
        {
            while (reader.Read())
            {
                if (reader.Name == "p")
                {
                    // I want to get all the TEXT contents from the this node
                    Console.WriteLine(reader.ReadInnerXml());
                }
            }
        }

Or if you want to skip the <b> as well, you can use an aux reader for the subtree, and only read the text nodes:

        StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
<data>
  <p>Here is some <b>data</b></p>
</data>");
        StringBuilder myVar = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(myUrl))
        {
            while (reader.Read())
            {
                if (reader.Name == "p")
                {
                    XmlReader pReader = reader.ReadSubtree();
                    while (pReader.Read())
                    {
                        if (pReader.NodeType == XmlNodeType.Text)
                        {
                            myVar.Append(pReader.Value);
                        }
                    }
                }
            }
        }

        Console.WriteLine(myVar.ToString());


I can't upvote or comment on others' responses, so let me just say carlosfigueira hit the nail on the head, that's exactly how you read the text value of an element. his answer helped me immensely.

for the sake of exmeplification here's my code:

while (reader.Read())
{
   switch (reader.NodeType)
   {
       case XmlNodeType.Element:
       {
           if (reader.Name == "CharCode")
           {
               switch (reader.ReadInnerXml())
               {
                   case "EUR":
                   {
                        reader.ReadToNextSibling("Value");
                        label4.Text = reader.ReadInnerXml();
                   }
                   break;
                   case "USD":
                   {
                        reader.ReadToNextSibling("Value");
                        label3.Text = reader.ReadInnerXml();
                   }
                   break;
                   case "RUB":
                   {
                        reader.ReadToNextSibling("Value");
                        label5.Text = reader.ReadInnerXml();
                   }
                   break;
                   case "RON":
                   {
                        reader.ReadToNextSibling("Value");
                        label6.Text = reader.ReadInnerXml();
                   }
                   break;
               }
           }
        }
        break;
    }
}

the file I'm reading can be found here: http://www.bnm.md/md/official_exchange_rates?get_xml=1&date= (you have to add a date in DD.MM.YYYY format to it to get the .XML)


I suggest you use HtmlAgilityPack which is a mature and, stable library for doing this sort of thing. It takes care of fetching the html, converting it to xml, and allows you to select the nodes you'd like with XPATH.

In your case it would be as simple as executing

        HtmlDocument doc = new HtmlWeb().Load(myUrl);
        string text = doc.DocumentNode.SelectSingleNode("/data/p").InnerText;
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜