Any way to strip namespace garbage from XML file?

I need to select some nodes from an XML file (AppNamespace.xaml from a Silverlight XAP file, not that it matters), but the file has namespace stuff so XPath doesn't work. I could waste most of a day trial-and-erroring the bondage-and-discipline nightmare of XmlNamespaceManager and end up with hopelessly fragile code that can't tolerate the slightest variation in the input file (not a great idea in production code), or I could use the ludicrous local-name() syntax[1].

But it would be more convenient to use XPath as a human-readable query language that can be used to return specified nodes or attribute values from arbitrary XML files.

So is there any way to strip the line-noise out of the file? Or am I stuck? Is the labyrinthine imbecility of Linq-to-XML truly the lesser evil?

[1]

//*[local-name() = 'Deployment']/*[local-name() = 'Deployment.Parts']/*[local-name() = 'AssemblyPart']/@*[local-name()='Name']

Update

Five years down the road, I stand behind the term "labyrinthine imbecility" with every fiber of my being, except for a few fibers that want to use something much stronger.


Ed, here's an example of using namespaces with the System.Xml.XPath.Extensions class. I've modified it to match the input you're looking at:

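using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

// Quotes inside a verbatim string are doubled. The attributes elided
// with "..." in the real file are left out so the sample parses.
string markup = @"
<Deployment xmlns=""http://schemas.microsoft.com/client/2007/deployment""
      xmlns:x=""http://schemas.microsoft.com/winfx/2006/xaml"">
  <Deployment.Parts>
    <AssemblyPart x:Name=""xamlName"" Source=""assembly"" />
  </Deployment.Parts>
</Deployment>
";

XmlReader reader = XmlReader.Create(new StringReader(markup));
// Load as a document so the path below is evaluated from above <Deployment>.
XDocument doc = XDocument.Load(reader);

XmlNamespaceManager nsm = new XmlNamespaceManager(reader.NameTable);
nsm.AddNamespace("x", "http://schemas.microsoft.com/winfx/2006/xaml");
nsm.AddNamespace("dep", "http://schemas.microsoft.com/client/2007/deployment");

// The path ends in an attribute, so use XPathEvaluate; XPathSelectElements
// only accepts expressions that yield elements.
IEnumerable<XAttribute> names =
   ((IEnumerable<object>)doc.XPathEvaluate("//dep:Deployment/dep:Deployment.Parts/dep:AssemblyPart/@x:Name", nsm)).Cast<XAttribute>();
foreach (XAttribute name in names)
    Console.WriteLine(name.Value);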

Not very complicated. Obviously you already know about XmlNamespaceManager, but I think you got a worse impression of it than it deserves.

When you say "hopelessly fragile code that can't tolerate the slightest variation in the input file", are you blaming namespaces in general, or XmlNamespaceManager? I don't see how either one makes it fragile, at least no more than XML processing code without namespaces, which also tolerates some changes in the input document and not others.

Have a little respect for other intelligent people in the industry, take a little time to understand the advantages behind a design before you dismiss it, and you will usually find that there are good reasons for what was done.

Not that XML namespaces couldn't be improved upon. However, nobody has managed to produce a better standard and get it accepted by the community.


In XPath 2.0 you can use namespace wildcards (if you know what you are doing):

//*:Deployment/*:Deployment.Parts/*:AssemblyPart/@Name

By the way, an attribute without a prefix is in no namespace at all. Since that is the most common case, I guess you don't need local-name() for the attribute.
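The built-in System.Xml XPath engine in .NET implements XPath 1.0 only, so the *: wildcard is not available there. If you really do want the namespaces gone before querying, something like the following might do. It is only a rough sketch: NamespaceStripper and StripDemo are names I made up, and it assumes that no element carries two attributes with the same local name in different namespaces (the XElement constructor would throw on the duplicate).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class NamespaceStripper
{
    // Hypothetical helper: returns a copy of the tree in which every element
    // and attribute name is reduced to its local name and all xmlns
    // declarations are dropped. The source tree is not modified.
    public static XElement Strip(XElement source)
    {
        return new XElement(
            source.Name.LocalName,
            source.Attributes()
                  .Where(a => !a.IsNamespaceDeclaration)
                  .Select(a => new XAttribute(a.Name.LocalName, a.Value)),
            // Child elements are rebuilt recursively; other nodes (text,
            // comments) are cloned by LINQ to XML when added to the new parent.
            source.Nodes().Select(n => n is XElement child ? Strip(child) : n));
    }
}

public static class StripDemo
{
    public static void Main()
    {
        XElement root = XElement.Parse(@"
<Deployment xmlns=""http://schemas.microsoft.com/client/2007/deployment""
      xmlns:x=""http://schemas.microsoft.com/winfx/2006/xaml"">
  <Deployment.Parts>
    <AssemblyPart x:Name=""xamlName"" Source=""assembly"" />
  </Deployment.Parts>
</Deployment>");

        // Wrap the stripped copy in a document so an absolute path can be used.
        XDocument plain = new XDocument(NamespaceStripper.Strip(root));

        // Plain, prefix-free XPath 1.0 now matches.
        foreach (XElement part in plain.XPathSelectElements("/Deployment/Deployment.Parts/AssemblyPart"))
            Console.WriteLine((string)part.Attribute("Name"));
    }
}
```

The obvious trade-off is that two elements that differ only by namespace become indistinguishable, so this is for read-only querying of files you roughly trust, not for round-tripping.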


I came here as a result of this search:

Any way to strip namespace garbage from XML file?

and I am adding an "Answer" to cheer on your "5 years on" update.

I was motivated to do this because I have an XML document that uses a tonne of namespaces:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:x2="urn:schemas-microsoft-com:office:excel2" version="1.0" exclude-result-prefixes="msxsl">

and APPARENTLY I have to know what all those namespaces are in advance in order to hard code the XmlNamespaceManager, or write some code that parses the namespace declarations and adds the relevant namespaces myself. Why in the name of all that is holy does the XmlDocument not manage to do that all by itself?

XmlDocument databaseXml = new XmlDocument();
databaseXml.LoadXml(xslt.XslTransform);
var dbnsmgr = new XmlNamespaceManager(databaseXml.NameTable);
dbnsmgr.AddNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
dbnsmgr.AddNamespace("ss", "urn:schemas-microsoft-com:office:spreadsheet");
// Pass the manager so the xsl: prefix in the path can be resolved.
XmlElement databaseStylesElement = (XmlElement)databaseXml.DocumentElement.SelectSingleNode("/xsl:stylesheet/xsl:template", dbnsmgr);
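For what it's worth, the "write some code that parses the namespace declarations" option can be fairly short. Here is a rough sketch, not a definitive implementation: NamespaceHarvester, FromDocument and the "def" prefix are names I invented for illustration. It assumes that the first declaration of a prefix wins if the same prefix is bound to different URIs in different parts of the document, and it registers the default namespace under a made-up prefix because XPath 1.0 has no way to refer to an empty prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NamespaceHarvester
{
    // Hypothetical helper: walks every element and registers each namespace
    // declared on it. The default namespace (empty prefix) is registered
    // under defaultPrefix so that XPath expressions can reference it.
    public static XmlNamespaceManager FromDocument(XmlDocument doc, string defaultPrefix = "def")
    {
        var manager = new XmlNamespaceManager(doc.NameTable);
        XPathNavigator nav = doc.CreateNavigator();

        foreach (XPathNavigator element in nav.Select("//*"))
        {
            // Local scope = only the declarations made on this element.
            foreach (KeyValuePair<string, string> ns in
                     element.GetNamespacesInScope(XmlNamespaceScope.Local))
            {
                if (ns.Value.Length == 0) continue; // skip xmlns="" undeclarations
                string prefix = ns.Key.Length == 0 ? defaultPrefix : ns.Key;
                if (!manager.HasNamespace(prefix))  // first declaration wins
                    manager.AddNamespace(prefix, ns.Value);
            }
        }
        return manager;
    }
}

public static class HarvestDemo
{
    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' " +
            "xmlns='urn:schemas-microsoft-com:office:spreadsheet' version='1.0'>" +
            "<xsl:template match='/'><Workbook/></xsl:template></xsl:stylesheet>");

        XmlNamespaceManager mgr = NamespaceHarvester.FromDocument(doc);

        // Prefixes come straight from the document; "def" stands in for xmlns="...".
        XmlNode template = doc.SelectSingleNode("/xsl:stylesheet/xsl:template", mgr);
        XmlNode workbook = doc.SelectSingleNode("//def:Workbook", mgr);
        Console.WriteLine(template != null && workbook != null);
    }
}
```

Note that this still ties your XPath to whatever prefixes the file's author happened to choose, which is exactly the fragility namespaces are supposed to avoid, so it is a convenience for ad hoc queries rather than something to lean on.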