How to compare two big XML files item by item efficiently?
I plan to implement a method to compare two big XML files (each has fewer than 10,000 element lines).
The method below works, but it does not perform well once a file has more than about 100 lines; it becomes very slow. How can I find a more efficient solution? Maybe it needs a better C# design, or a better algorithm for handling XML in C#.
Thanks in advance for your comments.
// Remove the items that appear in neither the Event XML file nor the ConfAddition XML file
XmlDocument doc = new XmlDocument();
doc.Load(xmlFile_AlarmSettingUp);
bool isNewAlid_Event = false;
bool isNewAlid_ConfAddition = false;
int alid = 0;
XmlNodeList xnList = doc.SelectNodes("/Equipment/AlarmSettingUp/EnabledALIDs/ALID");
foreach (XmlNode xn in xnList)
{
XmlAttributeCollection attCol = xn.Attributes;
for (int i = 0; i < attCol.Count; ++i)
{
if (attCol[i].Name == "alid")
{
alid = int.Parse(attCol[i].Value.ToString());
break;
}
}
//alid = int.Parse(attCol[1].Value.ToString());
XmlDocument docEvent_Alarm = new XmlDocument();
docEvent_Alarm.Load(xmlFile_Event);
XmlNodeList xnListEvent_Alarm = docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID");
foreach (XmlNode xnEvent_Alarm in xnListEvent_Alarm)
{
XmlAttributeCollection attColEvent_Alarm = xnEvent_Alarm.Attributes;
int alidEvent_Alarm = int.Parse(attColEvent_Alarm[1].Value.ToString());
if (alid == alidEvent_Alarm)
{
isNewAlid_Event = false;
break;
}
else
{
isNewAlid_Event = true;
//break;
}
}
XmlDocument docConfAddition_Alarm = new XmlDocument();
docConfAddition_Alarm.Load(xmlFile_ConfAddition);
XmlNodeList xnListConfAddition_Alarm = docConfAddition_Alarm.SelectNodes("/Equipment/Alarms/ALID");
foreach (XmlNode xnConfAddition_Alarm in xnListConfAddition_Alarm)
{
XmlAttributeCollection attColConfAddition_Alarm = xnConfAddition_Alarm.Attributes;
int alidConfAddition_Alarm = int.Parse(attColConfAddition_Alarm[1].Value.ToString());
if (alid == alidConfAddition_Alarm)
{
isNewAlid_ConfAddition = false;
break;
}
else
{
isNewAlid_ConfAddition = true;
//break;
}
}
if ( isNewAlid_Event && isNewAlid_ConfAddition )
{
// Store the root node of the destination document into an XmlNode
XmlNode rootDest = doc.SelectSingleNode("/Equipment/AlarmSettingUp/EnabledALIDs");
rootDest.RemoveChild(xn);
}
}
doc.Save(xmlFile_AlarmSettingUp);
My XML files look like this. The two XML files have the same structure, except that one of them may sometimes be modified by my app. That is why I need to compare them to see whether anything was modified.
<?xml version="1.0" encoding="utf-8"?>
<Equipment xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Licence LicenseId="" LicensePath="" />
<!--Alarm Setting Up XML File-->
<AlarmSettingUp>
<EnabledALIDs>
<ALID logicalName="Misc_EV_RM_STATION_ALREADY_RESERVED" alid="536870915" alcd="7" altx="Misc_Station 1 UnitName 2 SlotId already reserved" ceon="Misc_AlarmOn_EV_RM_STATION_ALREADY_RESERVED" ceoff="Misc_AlarmOff_EV_RM_STATION_ALREADY_RESERVED" />
<ALID logicalName="Misc_EV_RM_SEQ_READ_ERROR" alid="536870916" alcd="7" altx="Misc_Sequence ID 1 d step 2 d read error for wafer in 3 UnitName 4 SlotId" ceon="Misc_AlarmOn_EV_RM_SEQ_READ_ERROR" ceoff="Misc_AlarmOff_EV_RM_SEQ_READ_ERROR" />
...
...
...
</EnabledALIDs>
</AlarmSettingUp>
</Equipment>
The "ALID/@alid" attribute seems to be your key, so the first thing I would do (before foreach (XmlNode xn in xnList)) is build a dictionary (assuming the key is unique) over the @alid values of docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID"). Then you can do most of the work without O(n*m) performance; it will be closer to O(n+m), which is a big difference.
var lookup = new Dictionary<string, XmlElement>();
foreach(XmlElement el in docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID")) {
lookup.Add(el.GetAttribute("alid"), el);
}
then you can use:
XmlElement other;
if(lookup.TryGetValue(otherKey, out other)) {
// exists; element now in "other"
} else {
// doesn't exist
}
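Putting that together for the code in the question, here is a minimal sketch of how the whole pass might look. The names AlidPruner, CollectAlids and RemoveUnknown are made up for illustration, and a HashSet<string> stands in for the dictionary because only membership is needed here. It also assumes the alid values can be compared as strings; if the files may contain values such as "007" and "7" that should match, parse them to int as the question does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class AlidPruner
{
    // Collects the alid attribute value of every element matched by xpath.
    public static HashSet<string> CollectAlids(XmlDocument doc, string xpath)
    {
        var ids = new HashSet<string>();
        foreach (XmlElement el in doc.SelectNodes(xpath))
        {
            ids.Add(el.GetAttribute("alid"));
        }
        return ids;
    }

    // Removes every enabled ALID whose alid is in neither set.
    // Returns the number of elements removed.
    public static int RemoveUnknown(
        XmlDocument settings,
        HashSet<string> eventIds,
        HashSet<string> confIds)
    {
        // Copy the matches into a list first, so that removing nodes
        // does not disturb the enumeration of the live node list.
        List<XmlElement> enabled = settings
            .SelectNodes("/Equipment/AlarmSettingUp/EnabledALIDs/ALID")
            .Cast<XmlElement>()
            .ToList();

        int removed = 0;
        foreach (XmlElement el in enabled)
        {
            string alid = el.GetAttribute("alid");
            if (!eventIds.Contains(alid) && !confIds.Contains(alid))
            {
                el.ParentNode.RemoveChild(el);
                removed++;
            }
        }
        return removed;
    }
}
```

To use it, load the three documents once, call CollectAlids on the Event and ConfAddition documents with "/Equipment/Alarms/ALID", pass both sets to RemoveUnknown together with the AlarmSettingUp document, and then save that document. Each file is then walked once, rather than once per enabled ALID.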
XmlDocument and related classes (XmlNode, ...) are not particularly fast at XML processing. Try XmlTextReader instead.
Also, you call docEvent_Alarm.Load(xmlFile_Event); and docConfAddition_Alarm.Load(xmlFile_ConfAddition); on every iteration of the outer loop, which is wasteful. If xmlFile_Event and xmlFile_ConfAddition do not change during processing, it is better to load them once, before the main loop.
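As a rough sketch of the streaming approach, the two lookup files could each be read once, before the main loop, into a set of alid values. This uses XmlReader.Create, the factory method Microsoft recommends over constructing XmlTextReader directly. AlidReader and ReadAlids are hypothetical names, and for brevity the code matches any element named ALID rather than checking the full /Equipment/Alarms/ALID path.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class AlidReader
{
    // Streams through the XML once and collects the alid attribute
    // of every ALID element, without building a DOM tree.
    public static HashSet<string> ReadAlids(TextReader input)
    {
        var ids = new HashSet<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "ALID")
                {
                    // GetAttribute returns null when the attribute is absent.
                    string alid = reader.GetAttribute("alid");
                    if (alid != null)
                    {
                        ids.Add(alid);
                    }
                }
            }
        }
        return ids;
    }
}
```

For a file on disk, something like AlidReader.ReadAlids(File.OpenText(xmlFile_Event)) would do, called once per file before the loop over the enabled ALIDs. The AlarmSettingUp file can still be handled with XmlDocument, since it has to be modified and saved.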
Have you tried using Microsoft's XmlDiff class? See http://msdn.microsoft.com/en-us/library/aa302294.aspx