C# XML: how to get an attribute's value and the XPath of the element it belongs to

Posted: 2023-03-09 09:38

I'm looking for a way to walk through an XML file and, for a few attributes, get both the text and the XPath. I know how to get the text of the attributes I want, but I can't see the XPath of the element each one sits in, and I have no idea how to get it. Can someone help me? The code:

       // XML settings
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;                        

        // Loop through the XML to get all text from the right attributes
        using (XmlReader reader = XmlReader.Create(sourceFilepathTb.Text, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    if (reader.HasAttributes)
                    {
                        if (reader.GetAttribute("Caption") != null)
                        {                                
                            MessageBox.Show(reader.GetAttribute("Caption"));
                        }                            
                    }
                }
            }
        }

The XML:

<?xml version="1.0" encoding="utf-8"?>
<Test Description="Test XML" VersionFormat="123" ProtectedContentText="(Test test)">
    <Testapp>
        <TestappA>
            <A Id="0" Caption="Test 0" />
            <A Id="1" Caption="Test 1" />
            <A Id="2" Caption="Test 2" />
            <A Id="3" Caption="Test 3">
                <AA>
                    <B Id="4" Caption="Test 4" />
                </AA>
            </A>
        </TestappA>
        <AA>
            <Reason Id="5" Caption="Test 5" />
            <Reason Id="6" Caption="Test 6" />
            <Reason Id="7" Caption="Test 7" />
        </AA>
    </Testapp>
</Test>

IMHO, LINQ to XML is simpler:

var document = XDocument.Load(fileName);

var captions = document.Descendants()
    .Select(arg => arg.Attribute("Caption"))
    .Where(arg => arg != null)
    .Select(arg => arg.Value)
    .ToList();

[Update]

To find the XPath of each element that has a Caption attribute:

var captions = document.Descendants()
    .Select(arg =>
        new
        {
            CaptionAttribute = arg.Attribute("Caption"),
            XPath = GetXPath(arg)
        })
    .Where(arg => arg.CaptionAttribute != null)
    .Select(arg => new { Caption = arg.CaptionAttribute.Value, arg.XPath })
    .ToList();

private static string GetXPath(XElement el)
{
    if (el.Parent == null)
        return "/" + el.Name.LocalName;

    var name = GetXPath(el.Parent) + "/" + el.Name.LocalName;

    if (el.Parent.Elements(el.Name).Count() != 1)
        return string.Format(@"{0}[{1}]", name, (el.ElementsBeforeSelf(el.Name).Count() + 1));
    return name;
}
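
If what you want is the path to the attribute itself rather than to its element, one option is to append "/@Caption" to the element path. Here is a minimal sketch that wraps the idea in a compilable class. The class name CaptionPaths and the method Find are made up for this illustration, GetXPath is the same logic as above, and it assumes the elements are not in an XML namespace (LocalName drops any namespace).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CaptionPaths   // hypothetical name, for illustration only
{
    // Same logic as GetXPath above: add [n] only when same-named siblings exist.
    public static string GetXPath(XElement el)
    {
        if (el.Parent == null)
            return "/" + el.Name.LocalName;

        var name = GetXPath(el.Parent) + "/" + el.Name.LocalName;
        if (el.Parent.Elements(el.Name).Count() != 1)
            return name + "[" + (el.ElementsBeforeSelf(el.Name).Count() + 1) + "]";
        return name;
    }

    // Returns (attribute XPath, attribute value) pairs, in document order,
    // for every element that carries the attribute attrName.
    public static List<KeyValuePair<string, string>> Find(XDocument doc, string attrName)
    {
        return doc.Descendants()
            .Where(e => e.Attribute(attrName) != null)
            .Select(e => new KeyValuePair<string, string>(
                GetXPath(e) + "/@" + attrName,
                e.Attribute(attrName).Value))
            .ToList();
    }
}
```

Calling CaptionPaths.Find(XDocument.Load(fileName), "Caption") should give one pair per Caption attribute, which you can print as "path: value".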

Here's a start. You can work out how to prepend the leading slash.

using System;
using System.Xml;

namespace ConsoleApplication4 {
    class Program {
        static void Main(string[] args) {
            // XML settings
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.IgnoreWhitespace = true;
            settings.IgnoreComments = true;

            // Loop through the XML to get all text from the right attributes
            using ( XmlReader reader = XmlReader.Create("Test.xml", settings) ) {
                while ( reader.Read() ) {
                    if ( reader.NodeType == XmlNodeType.Element ) {
                        Console.Write(reader.LocalName + "/"); // <<<<----
                        if ( reader.HasAttributes ) {
                            if ( reader.GetAttribute("Caption") != null ) {
                                Console.WriteLine(reader.GetAttribute("Caption"));
                            }
                        }
                    }
                }
            }
            Console.Write("Press any key ..."); Console.ReadKey();
        }
    }
}

And just BTW, I try to avoid nesting code that deeply. Too hard to read.
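
One possible way to get the leading slash and the full path with a forward-only reader is to keep a list of the currently open element names: push on each Element node, pop on each EndElement, and pop immediately for empty elements such as <A ... />. The sketch below is only an illustration under those assumptions. The class name ReaderPaths and the method Find are invented here, and it does not produce positional predicates like [2].

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderPaths   // hypothetical name, for illustration only
{
    // Returns "/a/b/c/@attr: value" lines for each element that has attrName.
    // No positional predicates ([n]) are generated.
    public static List<string> Find(TextReader xml, string attrName)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
        var open = new List<string>();     // names of the open elements, root first
        var results = new List<string>();

        using (XmlReader reader = XmlReader.Create(xml, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    open.Add(reader.LocalName);

                    string value = reader.GetAttribute(attrName);
                    if (value != null)
                        results.Add("/" + string.Join("/", open) + "/@" + attrName + ": " + value);

                    // An empty element never yields an EndElement node, so pop it now.
                    if (reader.IsEmptyElement)
                        open.RemoveAt(open.Count - 1);
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    open.RemoveAt(open.Count - 1);
                }
            }
        }
        return results;
    }
}
```

For a file you could call it as ReaderPaths.Find(File.OpenText("Test.xml"), "Caption"). Keeping the path in a flat list also avoids the deep nesting.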

Cheers. Keith.

EDIT: (days later)

I finally got some time to myself... So I sat down and did this "correctly". It turned out to be a lot harder than I first thought. IMHO, this recursive solution is still easier to grok than XSLT, which I find infinitely confusing ;-)

using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

namespace ConsoleApplication4 
{
    public class XPathGrepper : IDisposable
    {
        private XmlReader _rdr;
        private TextWriter _out;

        public XPathGrepper(string xmlFilepath, TextWriter output) {
            _rdr = CreateXmlReader(xmlFilepath);
            _out = output;
        }

        private static XmlReader CreateXmlReader(string xmlFilepath) {
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.IgnoreWhitespace = true;
            settings.IgnoreComments = true;
            return XmlReader.Create(xmlFilepath, settings);
        }

        // descends through the XML, printing the xpath to each @attributeName.
        public void Attributes(string attributeName) {
            Attributes(_rdr, attributeName, "/");
        }
        // a recursive XML-tree descent, printing the xpath to each @attributeName.
        private void Attributes(XmlReader rdr, string attrName, string path) {
            // skip the containing element of the subtree (except root)
            if ( "/" != path ) 
                rdr.Read();
            // count how many times we've seen each distinct path.
            var kids = new Histogram();
            // foreach node at-this-level in the tree
            while ( rdr.Read() ) {
                if (rdr.NodeType == XmlNodeType.Element) {
                    // build the xpath-string to this element
                    string nodePath = path + rdr.LocalName;
                    nodePath += "[" + kids.Increment(nodePath) + "]/";
                    // print the xpath to the requested attribute of this node
                    if ( rdr.HasAttributes && rdr.GetAttribute(attrName) != null ) {
                        _out.WriteLine(nodePath + "@" + attrName);
                    }
                    // recursively read the subtree of this element.
                    Attributes(rdr.ReadSubtree(), attrName, nodePath);
                }
            }
        }

        public void Dispose() {
            if ( _rdr != null ) _rdr.Close();
        }

        private static void Pause() {
            Console.Write("Press enter to continue....");
            Console.ReadLine();
        }

        static void Main(string[] args) {
            using ( var grep = new XPathGrepper("Test.xml", Console.Out) ) {
                grep.Attributes("Caption");
            }
            Pause();
        }

        private class Histogram : Dictionary<string, int>
        {
            public int Increment(string key) {
                if ( base.ContainsKey(key) )
                    base[key] += 1;
                else
                    base.Add(key, 1);
                return base[key];
            }
        }

    }
}

A simple and precise XSLT solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/">
  <xsl:apply-templates select="//@Caption"/>
 </xsl:template>

 <xsl:template match="@Caption">
  <xsl:apply-templates select="." mode="path"/>
  <xsl:value-of select="concat(': ',.,'&#xA;')"/>
 </xsl:template>

 <xsl:template match="@Caption" mode="path">
  <xsl:for-each select="ancestor::*">
   <xsl:value-of select="concat('/',name())"/>

   <xsl:variable name="vSiblings" select=
   "count(../*[name()=name(current())])"/>

   <xsl:if test="$vSiblings > 1">
     <xsl:value-of select="
     concat('[',
              count(preceding-sibling::*
                [name()=name(current())]) +1,
            ']'
           )"/>
   </xsl:if>
  </xsl:for-each>

  <xsl:text>/@Caption</xsl:text>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<Test Description="Test XML" VersionFormat="123" ProtectedContentText="(Test test)">
    <Testapp>
        <TestappA>
            <A Id="0" Caption="Test 0" />
            <A Id="1" Caption="Test 1" />
            <A Id="2" Caption="Test 2" />
            <A Id="3" Caption="Test 3">
                <AA>
                    <B Id="4" Caption="Test 4" />
                </AA>
            </A>
        </TestappA>
        <AA>
            <Reason Id="5" Caption="Test 5" />
            <Reason Id="6" Caption="Test 6" />
            <Reason Id="7" Caption="Test 7" />
        </AA>
    </Testapp>
</Test>

the wanted, correct result is produced:

/Test/Testapp/TestappA/A[1]/@Caption: Test 0
/Test/Testapp/TestappA/A[2]/@Caption: Test 1
/Test/Testapp/TestappA/A[3]/@Caption: Test 2
/Test/Testapp/TestappA/A[4]/@Caption: Test 3
/Test/Testapp/TestappA/A[4]/AA/B/@Caption: Test 4
/Test/Testapp/AA/Reason[1]/@Caption: Test 5
/Test/Testapp/AA/Reason[2]/@Caption: Test 6
/Test/Testapp/AA/Reason[3]/@Caption: Test 7
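
Since the question is about C#, here is a rough sketch of how a stylesheet like the one above could be run from .NET with XslCompiledTransform. The class name XsltRunner and the method Transform are hypothetical, and both the stylesheet and the XML are passed in as strings only to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltRunner   // hypothetical name, for illustration only
{
    // Applies the stylesheet text to the XML text and returns the output as a string.
    public static string Transform(string xsltText, string xmlText)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(xsltText)))
        {
            xslt.Load(styleReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xmlText)))
        using (var output = new StringWriter())
        {
            // Writing to a TextWriter lets <xsl:output method="text"/> take effect.
            xslt.Transform(input, new XsltArgumentList(), output);
            return output.ToString();
        }
    }
}
```

With files on disk, File.ReadAllText can supply both arguments, and the returned string can be written to the console or split into lines.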

Do note: this is the only solution presented so far that generates the exact XPath expression for any single Caption attribute.

/Test/Testapp/TestappA/A/@Caption

selects 4 attribute nodes, whereas:

/Test/Testapp/TestappA/A[2]/@Caption

selects just a single attribute node and is what is really wanted.
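
The difference between the two expressions is easy to check from C#. Below is a small sketch that counts the nodes an XPath expression selects, using XmlDocument.SelectNodes. The class name XPathCount and the method CountMatches are invented for this example.

```csharp
using System;
using System.Xml;

public static class XPathCount   // hypothetical name, for illustration only
{
    // Loads the XML text and returns how many nodes the XPath expression selects.
    public static int CountMatches(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes(xpath).Count;
    }
}
```

Run against the XML document above, the expression without a predicate should report 4 matches and the one with A[2] should report 1.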
