
Merge two XElements

I'm not quite sure how to ask this, or whether it even exists, but I need to merge two XElements into a single element, with one taking precedence over the other.

The preference here is VB.NET and LINQ, but any language would be helpful if it demonstrates how to do this without me having to manually pick apart and resolve every single element and attribute.

For example, let's say I have two elements. Humor me on them being as different as they are.

1.

<HockeyPlayer height="6.0" hand="left">
<Position>Center</Position>
<Idol>Gordie Howe</Idol>
</HockeyPlayer>

2.

<HockeyPlayer height="5.9" startinglineup="yes">
<Idol confirmed="yes">Wayne Gretzky</Idol>
</HockeyPlayer>

The result of a merge would be

<HockeyPlayer height="6.0" hand="left" startinglineup="yes">
<Position>Center</Position>
<Idol confirmed="yes">Gordie Howe</Idol>
</HockeyPlayer>

Notice a few things: the height attribute value of #1 overrode #2. The hand attribute and its value were simply copied over from #1 (it doesn't exist in #2). The startinglineup attribute and its value were copied over from #2 (it doesn't exist in #1). The Position element in #1 was copied over (it doesn't exist in #2). The Idol element value in #1 overrides #2, but #2's confirmed attribute (it doesn't exist in #1) is copied over.

In short, #1 takes precedence over #2 where there is a conflict (meaning both have the same elements and/or attributes), and where there is no conflict, both are copied to the final result.

I've tried searching on this, but just can't seem to find anything, possibly because the words I'm using to search are too generic. Any thoughts or solutions (esp. for Linq)?


For the sake of others looking for the same thing, as I assume both the people contributing have long since lost interest... I needed to do something similar but a little more complete. Still not totally complete though, as the XMLDoc says it does not handle non-element content well, but I don't need to as my non-element content is either text or unimportant. Feel free to enhance and re-post... Oh and it's C# 4.0 as that's what I use...

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

/// <summary>
/// Provides facilities to merge two XElements or XML files. 
/// <para>
/// Where the LHS holds an element with non-element content and the RHS holds 
/// a tree, the LHS non-element content will be applied as text and the RHS 
/// tree ignored. 
/// </para>
/// <para>
/// This does not handle anything other than element and text nodes (in fact, 
/// anything other than an element is treated as text). Thus comments in the 
/// source XML are likely to be lost.
/// </para>
/// </summary>
/// <remarks>You can pass <see cref="XDocument.Root"/> if you have XDocuments 
/// to work with:
/// <code>
/// XDocument mergedDoc = new XDocument(MergeElements(lhsDoc.Root, rhsDoc.Root));
/// </code></remarks>
public class XmlMerging
{
    /// <summary>
    /// Produce an XML file that is made up of the unique data from both
    /// the LHS file and the RHS file. Where there are duplicates the LHS will 
    /// be treated as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to 
    /// write the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static bool TryMergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        try
        {
            // forward the caller's save options instead of silently using the default
            MergeXmlFiles(lhsPath, rhsPath, resultPath, options);
        }
        catch (Exception)
        {
            // could integrate your logging here
            return false;
        }
        return true;
    }

    /// <summary>
    /// Produce an XML file that is made up of the unique data from both the LHS
    /// file and the RHS file. Where there are duplicates the LHS will be treated 
    /// as master
    /// </summary>
    /// <param name="lhsPath">XML file to base the merge off. This will override 
    /// the RHS where there are clashes</param>
    /// <param name="rhsPath">XML file to enrich the merge with</param>
    /// <param name="resultPath">The fully qualified file name in which to write 
    /// the resulting merged XML</param>
    /// <param name="options"> Specifies the options to apply when saving. 
    /// Default is <see cref="SaveOptions.OmitDuplicateNamespaces"/></param>
    public static void MergeXmlFiles(string lhsPath, string rhsPath, 
        string resultPath, SaveOptions options = SaveOptions.OmitDuplicateNamespaces)
    {
        XElement result = 
            MergeElements(XElement.Load(lhsPath), XElement.Load(rhsPath));
        result.Save(resultPath, options);
    }

    /// <summary>
    /// Produce a resulting <see cref="XElement"/> that is made up of the unique 
    /// data from both the LHS element and the RHS element. Where there are 
    /// duplicates the LHS will be treated as master
    /// </summary>
    /// <param name="lhs">XML Element tree to base the merge off. This will 
    /// override the RHS where there are clashes</param>
    /// <param name="rhs">XML element tree to enrich the merge with</param>
    /// <returns>A merge of the left hand side and right hand side element 
    /// trees treating the LHS as master in conflicts</returns>
    public static XElement MergeElements(XElement lhs, XElement rhs)
    {
        // if either of the sides of the merge are empty then return the other... 
        // if they both are then we return null
        if (rhs == null) return lhs;
        if (lhs == null) return rhs;

        // Otherwise build a new result based on the root of the lhs (again lhs 
        // is taken as master)
        XElement result = new XElement(lhs.Name);

        MergeAttributes(result, lhs.Attributes(), rhs.Attributes());

        // now add the lhs child elements merged to the RHS elements if there are any
        MergeSubElements(result, lhs, rhs);
        return result;
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// attribute collections.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master
    /// </summary>
    /// <param name="elementToUpdate">The element to take the merged attribute 
    /// collection</param>
    /// <param name="lhs">The master set of attributes</param>
    /// <param name="rhs">The attributes to enrich the merge</param>
    private static void MergeAttributes(XElement elementToUpdate, 
        IEnumerable<XAttribute> lhs, IEnumerable<XAttribute> rhs)
    {
        // Add the attributes of the lhs first. Only attributes that are new 
        // are taken from the rhs; duplicates are ignored because lhs is master.
        elementToUpdate.Add(lhs);

        // Collapse the attribute names to a list once to save multiple 
        // evaluations (which is also why this is not written as a sub-query).
        List<XName> lhsAttributeNames = 
            lhs.Select(attribute => attribute.Name).ToList();
        // Now add any attributes that exist only on the rhs.
        elementToUpdate.Add(rhs.Where(attribute => 
            !lhsAttributeNames.Contains(attribute.Name)));
    }

    /// <summary>
    /// Enrich the passed in <see cref="XElement"/> with the contents of both 
    /// <see cref="XElement.Elements()"/> subtrees.
    /// Again where the RHS conflicts with the LHS, the LHS is deemed the master.
    /// Where the passed elements do not have element subtrees, but do have text 
    /// content that will be used. Again the LHS will dominate
    /// </summary>
    /// <remarks>Where the LHS has text content and no subtree, but the RHS has 
    /// a subtree; the LHS text content will be used and the RHS tree ignored. 
    /// This may be unexpected but is consistent with other .NET XML 
    /// operations</remarks>
    /// <param name="elementToUpdate">The element to take the merged element 
    /// collection</param>
    /// <param name="lhs">The element from which to extract the master 
    /// subtree</param>
    /// <param name="rhs">The element from which to extract the subtree to 
    /// enrich the merge</param>
    private static void MergeSubElements(XElement elementToUpdate, 
        XElement lhs, XElement rhs)
    {
        // see below for the special case where there are no children on the LHS
        if (lhs.Elements().Any())
        {
            // collapse the element names to a list to save multiple evaluations...
            // also why we ain't putting this in as a sub-query later
            List<XName> lhsElementNames = 
                lhs.Elements().Select(element => element.Name).ToList();

            // Add in the elements of the lhs and merge in any elements of the 
            // same name on the RHS
            elementToUpdate.Add(
                lhs.Elements().Select(
                    lhsElement => 
                        MergeElements(lhsElement, rhs.Element(lhsElement.Name))));

            // so add in any missing elements from the rhs
            elementToUpdate.Add(rhs.Elements().Where(element => 
                !lhsElementNames.Contains(element.Name)));
        }
        else
        {
            // special case for elements where they have no element children 
            // but still have content:
            // use the lhs text value if it is there
            if (!string.IsNullOrEmpty(lhs.Value))
            {
                elementToUpdate.Value = lhs.Value;
            }
            // if it isn't then see if we have any children on the right
            else if (rhs.Elements().Any())
            {
                // we do so shove them in the result unaltered
                elementToUpdate.Add(rhs.Elements());
            }
            else
            {
                // nope, then use the text value (doesn't matter if it is empty 
                // as we have nothing better elsewhere)
                elementToUpdate.Value = rhs.Value;
            }
        }
    }
}
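
As a quick check, the class above can be run against the two elements from the question. This is only a usage sketch: the `MergeUsageSketch` class and its `MergeSample` method are names made up for illustration, and it assumes the `XmlMerging` class is compiled into the same project.

```csharp
using System;
using System.Xml.Linq;

public static class MergeUsageSketch
{
    // Builds the two sample elements from the question and merges them,
    // treating the first one as master. Assumes the XmlMerging class above
    // is available in the same project.
    public static XElement MergeSample()
    {
        XElement lhs = XElement.Parse(
            "<HockeyPlayer height=\"6.0\" hand=\"left\">" +
            "<Position>Center</Position>" +
            "<Idol>Gordie Howe</Idol>" +
            "</HockeyPlayer>");
        XElement rhs = XElement.Parse(
            "<HockeyPlayer height=\"5.9\" startinglineup=\"yes\">" +
            "<Idol confirmed=\"yes\">Wayne Gretzky</Idol>" +
            "</HockeyPlayer>");

        // MergeElements builds a new tree; lhs and rhs are left unmodified
        // because XElement.Add clones content that already has a parent.
        return XmlMerging.MergeElements(lhs, rhs);
    }

    public static void Main()
    {
        Console.WriteLine(MergeSample());
    }
}
```

Tracing the code by hand, the merged element should carry height="6.0", hand and startinglineup, keep Position, and give Idol the text "Gordie Howe" with the confirmed attribute, as in the expected result in the question.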


Here's a console app that produces the result listed in your question. It uses recursion to process each sub element. The one thing it doesn't check for is child elements that appear in Elem2 that aren't in Elem1, but hopefully this will get you started towards a solution.

I'm not sure if I would say this is the best possible solution, but it does work.

Imports System
Imports System.Linq
Imports System.Xml.Linq

Module Module1

Function MergeElements(ByVal Elem1 As XElement, ByVal Elem2 As XElement) As XElement

    If Elem2 Is Nothing Then
        Return Elem1
    End If

    Dim result = New XElement(Elem1.Name)

    For Each attr In Elem1.Attributes
        result.Add(attr)
    Next

    Dim Elem1AttributeNames = From attr In Elem1.Attributes _
                              Select attr.Name

    For Each attr In Elem2.Attributes
        If Not Elem1AttributeNames.Contains(attr.Name) Then
            result.Add(attr)
        End If
    Next

    If Elem1.Elements().Count > 0 Then
        For Each elem In Elem1.Elements
            result.Add(MergeElements(elem, Elem2.Element(elem.Name)))
        Next
    Else
        result.Value = Elem1.Value
    End If

    Return result
End Function

Sub Main()
    Dim Elem1 = <HockeyPlayer height="6.0" hand="left">
                    <Position>Center</Position>
                    <Idol>Gordie Howe</Idol>
                </HockeyPlayer>

    Dim Elem2 = <HockeyPlayer height="5.9" startinglineup="yes">
                    <Idol confirmed="yes">Wayne Gretzky</Idol>
                </HockeyPlayer>

    Console.WriteLine(MergeElements(Elem1, Elem2))
    Console.ReadLine()
End Sub

End Module

Edit: I just noticed that the function was missing As XElement. I'm actually surprised that it worked without that! I work with VB.NET every day, but it has some quirks that I still don't totally understand.
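
The gap mentioned above (child elements that appear in Elem2 but not in Elem1) could be closed with one extra step after the loop over Elem1's children. Below is a minimal sketch of just that step, written in C# to match the other answer in this thread. The helper name `AddChildrenOnlyInSecond` and the `Team` element in the sample data are made up for illustration and are not part of either answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MissingChildrenSketch
{
    // Hypothetical helper: copies into result every child of elem2 whose
    // name does not occur among the children of elem1.
    public static void AddChildrenOnlyInSecond(
        XElement result, XElement elem1, XElement elem2)
    {
        var elem1ChildNames =
            new HashSet<XName>(elem1.Elements().Select(e => e.Name));

        // XElement.Add clones nodes that already have a parent,
        // so elem2 itself is left untouched.
        result.Add(elem2.Elements()
            .Where(e => !elem1ChildNames.Contains(e.Name)));
    }

    public static void Main()
    {
        // Made-up sample data: Team exists only in the second element.
        var elem1 = XElement.Parse(
            "<HockeyPlayer><Position>Center</Position></HockeyPlayer>");
        var elem2 = XElement.Parse(
            "<HockeyPlayer><Team>Red Wings</Team>" +
            "<Position>Wing</Position></HockeyPlayer>");

        // Stand-in for the result that the merge has already built
        // from elem1's children.
        var result = new XElement(elem1);
        AddChildrenOnlyInSecond(result, elem1, elem2);

        Console.WriteLine(result);
    }
}
```

In the VB function, the equivalent call would go at the end of the `If Elem1.Elements().Count > 0` branch, so that Elem1's children still come first and keep precedence.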
