How to load and merge a dataset of XML docs
I would like to consume a dataset of XML documents, and merge them into a single document containing only distinct elements.
To illustrate, a dataset such as:
r, x
-- -------------------------------
1, <root><a>111</a></root>
2, <root><a>222</a><b>222</b></root>
3, <root><c>333</c></root>
would result in:
<a>111</a><b>222</b><c>333</c>
The <a> element from r=2 is not merged since we already have an <a> element from r=1. I only need to merge new elements, starting with r=1 and going forward.
I am able to iterate over the list, but I am having difficulty comparing and merging. The code below fails to identify <a>222</a> as a duplicate. Is it possibly comparing the element values as well?
using (SqlDataReader dsReader = cmd.ExecuteReader())
{
    XDocument baseDoc = new XDocument();
    XDocument childDoc = new XDocument();
    while (dsReader.Read())
    {
        // this is the base doc, merge forward from here
        if (dsReader["r"].ToString() == "1")
        {
            baseDoc = XDocument.Parse(dsReader["x"].ToString());
            SqlContext.Pipe.Send("start:" + baseDoc.ToString());
        }
        // this is a child doc, do merge operation
        else
        {
            childDoc = XDocument.Parse(dsReader["x"].ToString());
            // find elements only present in child
            var childOnly = (childDoc.Descendants("root").Elements()).Except(baseDoc.Descendants("root").Elements());
            foreach (var e in childOnly)
            {
                baseDoc.Root.Add(e);
            }
        }
    }
}
I am a bit confused about the use of baseDoc and childDoc in your code, but I hope I have understood your question correctly. Here is my proposal:
using (SqlDataReader dsReader = cmd.ExecuteReader())
{
    XElement result = new XElement("root");
    while (dsReader.Read())
    {
        // Read source
        XDocument srcDoc = XDocument.Parse(dsReader["x"].ToString());
        // Construct result element
        foreach (XElement baseElement in srcDoc.Descendants("root").Elements())
            if (result.Element(baseElement.Name) == null) // skip already added nodes
                result.Add(new XElement(baseElement.Name, baseElement.Value));
    }
    // Construct result string from sub-elements (to avoid "<root>..</root>" in output)
    string str = "";
    foreach (XElement element in result.Elements())
        str += element.ToString();
    // send the result
    SqlContext.Pipe.Send("start:" + str);
}
Note that my code ignores the r numbering and uses the rows in the order they come from the SQL data reader. If the rows are not sorted by "r", they need to be sorted before this code runs.
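On the question of whether Except compares element values: as far as I can tell it does not compare values at all. Without an explicit comparer, Except uses the default equality comparer, and XElement does not override Equals, so two elements are only equal when they are the same object. Elements parsed from two different documents are never the same object, so every child element looks new. Below is a minimal sketch of a name-based alternative that also sorts by r first. ElementNameComparer, XmlMerge and Merge are hypothetical names introduced for illustration, and the in-memory list of (r, x) pairs is an assumed stand-in for the rows read from the SqlDataReader.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: two elements are "equal" when their names match,
// regardless of their values.
public sealed class ElementNameComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Name == y.Name;
    }

    public int GetHashCode(XElement e)
    {
        return e.Name.GetHashCode();
    }
}

public static class XmlMerge
{
    // Merges the documents in ascending r order and keeps the first
    // element seen for each element name.
    public static XElement Merge(IEnumerable<KeyValuePair<int, string>> rows)
    {
        var comparer = new ElementNameComparer();
        var result = new XElement("root");

        foreach (var row in rows.OrderBy(p => p.Key))
        {
            XElement src = XDocument.Parse(row.Value).Root;

            // ToList() evaluates the query before result is modified.
            List<XElement> childOnly =
                src.Elements().Except(result.Elements(), comparer).ToList();

            foreach (XElement e in childOnly)
                result.Add(new XElement(e)); // copy keeps attributes and child nodes
        }
        return result;
    }

    public static void Main()
    {
        // Sample rows from the question, deliberately out of order.
        var rows = new List<KeyValuePair<int, string>>
        {
            new KeyValuePair<int, string>(3, "<root><c>333</c></root>"),
            new KeyValuePair<int, string>(1, "<root><a>111</a></root>"),
            new KeyValuePair<int, string>(2, "<root><a>222</a><b>222</b></root>"),
        };

        XElement merged = Merge(rows);

        // Concatenate the children to avoid the <root> wrapper in the output.
        Console.WriteLine(string.Concat(merged.Elements()));
        // prints <a>111</a><b>222</b><c>333</c>
    }
}
```

If the rows come straight from SQL, adding an ORDER BY on r to the query would make the in-memory OrderBy unnecessary. Copying with new XElement(e) rather than new XElement(name, value) is a design choice here: it preserves attributes and nested elements, which the name/value form would flatten to text.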