Avoiding Redundancies in XML documents

I was working with a certain XML format in which no element name was repeated:

<person>
  <eye>
    <eye_info>
       <eye_color>
       blue
       </eye_color>
    </eye_info>
  </eye>
  <hair>
    <hair_info>
       <hair_color>
       blue
       </hair_color>
    </hair_info>
  </hair>
</person>

As you can see, the sub-tag eye_color includes "eye" in its name, so there was no ambiguity to worry about. I could get the eye color in a single line after loading the XML into a DataSet:

dataset.ReadXml(path);
// C# indexers use square brackets
value = dataset.Tables["eye_info"].Rows[0]["eye_color"];

I realise this is not the smartest way of doing it, and the situation I'm in now wasn't unforeseen.

Now, let's say I have to read XML documents in this format:

<person>
  <eye>
    <info>
       <color>
       blue
       </color>
    </info>
  </eye>
  <hair>
    <info>
       <color>
       blue
       </color>
    </info>
  </hair>
</person>

So if I try to call it like this:

dataset.ReadXml(path);
// C# indexers use square brackets
value = dataset.Tables["info"].Rows[0]["color"];

The lookup becomes ambiguous: with my previous method I can only go one level up to identify a single field, and here the 'disambiguator' (eye or hair) sits further up the tree.

Is there a practical way to reliably reach a single field, given all (or at least a few) of the elements above it?

--[EDIT]--

I've asked another question about how to get a certain node with LINQ; check it out.


You can also use LINQ to XML (the System.Xml.Linq namespace) and retrieve your data like this:

string xml = @"<persons>
<person> 
  <eye> 
    <info> 
       <color>blue</color> 
    </info> 
  </eye> 
  <hair> 
    <info> 
       <color>blonde</color> 
    </info> 
  </hair> 
</person>
<person> 
  <eye> 
    <info> 
       <color>green</color> 
    </info> 
  </eye> 
  <hair> 
    <info> 
       <color>brown</color> 
    </info> 
  </hair> 
</person>
</persons>";

XDocument document = XDocument.Parse(xml);

var query = from person in document.Descendants("person")
            select new
            {
                EyeColor = person.Element("eye").Element("info").Element("color").Value,
                HairColor = person.Element("hair").Element("info").Element("color").Value
            };

foreach (var person in query)
    Console.WriteLine("{0}\t{1}", person.EyeColor, person.HairColor);
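
Note that the chained `.Element(...).Value` calls above throw a NullReferenceException if any element along the path is missing. A minimal sketch of a more forgiving lookup follows; `ReadPath` is a hypothetical helper name (not part of LINQ to XML), and the sample document is made up for illustration. It also trims the value, since the question's XML puts the text on its own line.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: follows a chain of child element names and returns
// the trimmed text of the last one, or null if any step is missing.
static string ReadPath(XElement start, params string[] names)
{
    XElement current = start;
    foreach (string name in names)
    {
        // Element() returns null when no child with that name exists
        current = current?.Element(name);
    }
    return current?.Value.Trim();
}

// Illustrative data: the first person has no <hair>, the second no <eye>
XDocument document = XDocument.Parse(
    "<persons>" +
    "<person><eye><info><color> blue </color></info></eye></person>" +
    "<person><hair><info><color>brown</color></info></hair></person>" +
    "</persons>");

foreach (XElement person in document.Descendants("person"))
{
    string eye = ReadPath(person, "eye", "info", "color") ?? "(none)";
    string hair = ReadPath(person, "hair", "info", "color") ?? "(none)";
    Console.WriteLine("{0}\t{1}", eye, hair);
}
```

Because each lookup starts from a specific person and names every step (eye, info, color), the repeated info and color element names no longer cause ambiguity.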


There is a whole standard organized around querying data from an XML document. The standard is called XPath, and .NET provides an implementation (the System.Xml.XPath namespace) that you can use from C#. It's not the easiest thing to learn out of the box, but it is one of the best techniques for extracting data from XML and well worth learning.

Here is an example.

EDIT: I would recommend you figure out LINQ to XML instead, as it's more powerful. If you still want XPath, your specific query would look something like this (I don't have VS on this computer, so I couldn't verify it):

// Requires: using System; using System.IO; using System.Xml.XPath;
XPathDocument doc = new XPathDocument(new StringReader(xmlString));
XPathNavigator nav = doc.CreateNavigator();

// Compile the XPath expression that selects the eye color
XPathExpression expr = nav.Compile("/person/eye/eye_info/eye_color");
XPathNodeIterator iterator = nav.Select(expr);

// Iterate on the node set
while (iterator.MoveNext())
{
   XPathNavigator nav2 = iterator.Current.Clone();
   Console.WriteLine(nav2.Value);
}
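
For the second format in the question, where info and color repeat under both eye and hair, the same idea applies: an XPath that spells out the full path is unambiguous. The sketch below assumes the XML is loaded into an XDocument and uses the XPathSelectElement extension method from System.Xml.XPath; the sample values are illustrative and it has not been run against the asker's real files.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Sample document in the question's second format, where <info> and <color>
// appear under both <eye> and <hair>.
XDocument document = XDocument.Parse(
    "<person>" +
    "<eye><info><color>blue</color></info></eye>" +
    "<hair><info><color>blonde</color></info></hair>" +
    "</person>");

// The full path names the disambiguating ancestor (eye or hair) explicitly.
XElement eyeColor = document.XPathSelectElement("/person/eye/info/color");
XElement hairColor = document.XPathSelectElement("/person/hair/info/color");

// XPathSelectElement returns null when nothing matches, hence the ?. operator
Console.WriteLine(eyeColor?.Value.Trim());
Console.WriteLine(hairColor?.Value.Trim());
```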


Remember that the DataSet class can only understand XML that can be translated into the form of a relational database. In particular, it will not handle a single child "table" that has multiple parent "tables". Having an "info" element as the child of both eye and hair is an example of this.
