How do I remove duplicate xml element values in an XDocument?
I have the code shown at the end of this question. I want to check for and remove StateSeparationRequest entries whose StateRequestRecordGUID element contains a duplicate value. Here is an example XML file that needs to be corrected:
<?xml version="1.0"?>
<StateSeparationRequestCollection xsi:schemaLocation="https://uidataexchange.org/schemas SeparationRequest.xsd" xmlns="https://uidataexchange.org/schemas" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StateSeparationRequest>
<StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
<SSN>999999999</SSN>
<ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
<ClaimNumber>012345678901234567</ClaimNumber>
<StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
<EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
<FEIN>794741844</FEIN>
<TypeofEmployerCode>1</TypeofEmployerCode>
<TypeofClaimCode>1</TypeofClaimCode>
<BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
<RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
<UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
<UIOfficePhone>6085264400</UIOfficePhone>
<UIOfficeFax>6085269394</UIOfficeFax>
<ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
<OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
<ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
<ClaimantMiddleInitial>T</ClaimantMiddleInitial>
<ClaimantSuffix>Jr.-4567</ClaimantSuffix>
<ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
<ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
<ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
<WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
<WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
<WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
<ClaimantSepReasonCode>1</ClaimantSepReasonCode>
<ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
<ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
<RequestDate>2006-06-07</RequestDate>
<ResponseDueDate>2006-06-17</ResponseDueDate>
<FormNumber>606C</FormNumber>
</StateSeparationRequest>
<StateSeparationRequest>
<StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
<SSN>999999999</SSN>
<ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
<ClaimNumber>012345678901234567</ClaimNumber>
<StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
<EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
<FEIN>794741844</FEIN>
<TypeofEmployerCode>1</TypeofEmployerCode>
<TypeofClaimCode>1</TypeofClaimCode>
<BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
<RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
<UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
<UIOfficePhone>6085264400</UIOfficePhone>
<UIOfficeFax>6085269394</UIOfficeFax>
<ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
<OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
<ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
<ClaimantMiddleInitial>T</ClaimantMiddleInitial>
<ClaimantSuffix>Jr.-4567</ClaimantSuffix>
<ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
<ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
<ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
<WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
<WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
<WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
<ClaimantSepReasonCode>1</ClaimantSepReasonCode>
<ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
<ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
<RequestDate>2006-06-07</RequestDate>
<ResponseDueDate>2006-06-17</ResponseDueDate>
<FormNumber>606C</FormNumber>
</StateSeparationRequest>
</StateSeparationRequestCollection>
Here is what the corrected XML should look like:
<?xml version="1.0"?>
<StateSeparationRequestCollection xsi:schemaLocation="https://uidataexchange.org/schemas SeparationRequest.xsd" xmlns="https://uidataexchange.org/schemas" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StateSeparationRequest>
<StateRequestRecordGUID>30000000000000000000000000004000</StateRequestRecordGUID>
<SSN>999999999</SSN>
<ClaimEffectiveDate>2007-06-04</ClaimEffectiveDate>
<ClaimNumber>012345678901234567</ClaimNumber>
<StateEmployerAccountNbr>01234567890123456789</StateEmployerAccountNbr>
<EmployerName>JC PENNEY COMPANY INC 234567890123456789012345678901234567890123456789012345678901234567890123456789</EmployerName>
<FEIN>794741844</FEIN>
<TypeofEmployerCode>1</TypeofEmployerCode>
<TypeofClaimCode>1</TypeofClaimCode>
<BenefitYearBeginDate>2007-06-04</BenefitYearBeginDate>
<RequestingStateAbbreviation>ST</RequestingStateAbbreviation>
<UIOfficeName>Park Oaks 012345678901234</UIOfficeName>
<UIOfficePhone>6085264400</UIOfficePhone>
<UIOfficeFax>6085269394</UIOfficeFax>
<ClaimantLastName>SMITH-678901234567890123456789</ClaimantLastName>
<OtherLastName>WILLIAMS-901234567890123456789</OtherLastName>
<ClaimantFirstName>JOHN-56789012345678901234</ClaimantFirstName>
<ClaimantMiddleInitial>T</ClaimantMiddleInitial>
<ClaimantSuffix>Jr.-4567</ClaimantSuffix>
<ClaimantJobTitle>Manager-8901234567890123456789</ClaimantJobTitle>
<ClaimantReportedFirstDayofWork>2006-01-04</ClaimantReportedFirstDayofWork>
<ClaimantReportedLastDayofWork>2007-05-31</ClaimantReportedLastDayofWork>
<WagesWeeksNeededCode>WO</WagesWeeksNeededCode>
<WagesNeededBeginDate>2005-05-01</WagesNeededBeginDate>
<WagesNeededEndDate>2005-05-30</WagesNeededEndDate>
<ClaimantSepReasonCode>1</ClaimantSepReasonCode>
<ClaimantSepReasonComments>AAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAA</ClaimantSepReasonComments>
<ReturntoWorkDate>2010-01-01</ReturntoWorkDate>
<RequestDate>2006-06-07</RequestDate>
<ResponseDueDate>2006-06-17</ResponseDueDate>
<FormNumber>606C</FormNumber>
</StateSeparationRequest>
</StateSeparationRequestCollection>
This is the code I have so far:
int TotalCount = ssrcWrapper.EmployerTPASeparationRequestCollection.EmployerTPASeparationRequest.Count();
string ssrcWrapperString = XmlSerializerUtils.SerializeToXMLstring(ssrcWrapper.EmployerTPASeparationRequestCollection);
System.IO.StringReader myStringReader = new System.IO.StringReader(ssrcWrapperString);
XmlReader xmlreader = XmlReader.Create(myStringReader);
xmlreader.MoveToContent();
for (int i = 0; i < TotalCount; i++)
{
    SqlConnection conn4 = new SqlConnection("Data Source=.\\sqlexpress;Initial Catalog=test_BdbCSSQL01;Persist Security Info=False;Integrated Security=SSPI;");
    conn4.Open();
    string sql = "SELECT * FROM SIDESStagingIN";
    SqlDataAdapter da = new SqlDataAdapter(sql, conn4);
    DataTable dt = new DataTable();
    da.Fill(dt);
    DataRow dr;
    dr = dt.NewRow();
    dt.Rows.Add(dr);
    XDocument doc = XDocument.Load(xmlreader);
    XNamespace ns = "https://uidataexchange.org/schemas";
    var node = doc.Descendants(ns + "EmployerTPASeparationRequest");
    var node2 = node.ElementAt(i);
    string _StateRequestRecordGUID = "";
    foreach (var element in node2.Elements())
    {
        if (element.Name.LocalName == "StateRequestRecordGUID")
        {
            _StateRequestRecordGUID = element.Value;
        }
        if (element.Name.LocalName == "AttachmentOccurrence")
        {
            //ZJR: TODO: Create new XDoc and write values to dbo.SIDESAttachmentIN
            SqlConnection conn5 = new SqlConnection("Data Source=.\\sqlexpress;Initial Catalog=test_BdbCSSQL01;Persist Security Info=False;Integrated Security=SSPI;");
            conn5.Open();
            string sql2 = "SELECT * FROM SIDESAttachmentIN";
            SqlDataAdapter da2 = new SqlDataAdapter(sql2, conn5);
            DataTable dt2 = new DataTable();
            da2.Fill(dt2);
            DataRow dr2;
            dr2 = dt2.NewRow();
            dt2.Rows.Add(dr2);
            dr2["AttachmentID"] = _StateRequestRecordGUID;
            var attachmentNode = doc.Descendants(ns + "AttachmentOccurrence");
            foreach (var attachmentElement in attachmentNode.Elements())
            {
                dr2[attachmentElement.Name.LocalName] = attachmentElement.Value;
            }
            dr["AttachmentID"] = _StateRequestRecordGUID;
            SqlCommandBuilder sb2 = new SqlCommandBuilder(da2);
            da2.Update(dt2);
            if (conn5 != null) { conn5.Close(); }
        }
        if (dr.Table.Columns.Contains(element.Name.LocalName))
        {
            dr[element.Name.LocalName] = element.Value;
        }
    }
    SqlCommandBuilder sb = new SqlCommandBuilder(da);
    da.Update(dt);
    if (conn4 != null) { conn4.Close(); }
}
Give this a shot:
var duplicates = (from req in doc.Descendants(ns + "StateSeparationRequest")
                  group req by req.Descendants(ns + "StateRequestRecordGUID").First().Value
                  into g
                  where g.Count() > 1
                  select g.Skip(1)).SelectMany(elements => elements);
foreach (var duplicate in duplicates)
{
    duplicate.Remove();
}
What this query essentially does is:
Group the StateSeparationRequest elements by their StateRequestRecordGUID value. For each group containing more than one StateSeparationRequest, select all but the first one. What you are left with is the list of duplicates, which you can iterate over and remove.
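For anyone who wants to try the grouping approach in isolation, here is a minimal, self-contained sketch in method syntax. The class name DuplicateRemover, the method RemoveDuplicateRequests, and the trimmed-down sample XML are made up for illustration and are not from the question. It also assumes StateRequestRecordGUID is a direct child of StateSeparationRequest (as in the sample file), skips requests that have no GUID at all, and calls ToList() so the list of duplicates is fixed before the tree is modified. Treat it as one possible variation, not the only way to do it.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateRemover
{
    // Hypothetical helper: removes every StateSeparationRequest whose
    // StateRequestRecordGUID value was already seen earlier in the document.
    // Returns the number of elements removed.
    public static int RemoveDuplicateRequests(XDocument doc, XNamespace ns)
    {
        var duplicates = doc.Descendants(ns + "StateSeparationRequest")
            // Assumption: requests without a GUID are left alone.
            .Where(req => req.Element(ns + "StateRequestRecordGUID") != null)
            // Group by the GUID text.
            .GroupBy(req => req.Element(ns + "StateRequestRecordGUID").Value)
            // Keep the first element of each group, flag the rest.
            .SelectMany(g => g.Skip(1))
            // Materialize before changing the tree.
            .ToList();

        foreach (var duplicate in duplicates)
        {
            duplicate.Remove();
        }
        return duplicates.Count;
    }

    public static void Main()
    {
        XNamespace ns = "https://uidataexchange.org/schemas";
        // Trimmed-down sample: two requests share GUID A, one has GUID B.
        var doc = XDocument.Parse(
            @"<StateSeparationRequestCollection xmlns=""https://uidataexchange.org/schemas"">
                <StateSeparationRequest><StateRequestRecordGUID>A</StateRequestRecordGUID></StateSeparationRequest>
                <StateSeparationRequest><StateRequestRecordGUID>A</StateRequestRecordGUID></StateSeparationRequest>
                <StateSeparationRequest><StateRequestRecordGUID>B</StateRequestRecordGUID></StateSeparationRequest>
              </StateSeparationRequestCollection>");

        int removed = RemoveDuplicateRequests(doc, ns);
        int remaining = doc.Descendants(ns + "StateSeparationRequest").Count();
        Console.WriteLine("Removed " + removed + ", " + remaining + " remaining");
        // prints "Removed 1, 2 remaining"
    }
}
```

Note that Skip(1) on a group with a single element yields nothing, so the explicit Count() > 1 filter from the query above is not needed in this form.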