LINQ VB how to check for duplicates in a list of objects
I have a list of objects, each with 2 relevant properties: "ID" and "Name". Let's call the list "lstOutcomes".
I need to check the list for duplicates (meaning object1.ID = object2.ID, etc.) and set a flag (valid = false, or something) if there is at least one duplicate. Also, it would be nice to send a message to the user mentioning the "Name" of the object when it fails.
I am sure I will need to use the Group By operator to do this, but I am not used to doing that in LINQ, and the examples out there are just not helping me. This article seems to be close to what I need, but not quite, and it's in C#.
Here is a starting stab at it...
Dim duplist = _
    (From o As objectType In lstOutcomes _
     Group o By o.ID Into g = Group _
     Let dups = g.Where(Function(h) g.Count > 1) _
     Order By dups Descending).ToArray
If duplist.Count > 0 Then
    valid = False
End If
help?
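I'll write it in C#, but I hope you can convert it to VB. It does not use a join and is O(n log n). I assumed you have a List<T>:
// O(n log n) part: sort by ID so that equal IDs end up next to each other.
lst.Sort((a, b) => a.ID.CompareTo(b.ID));
// After Skip(1), lst[index] is the element just before x, so this compares neighbours.
var duplicatedItems = lst.Skip(1).Where((x, index) => x.ID == lst[index].ID);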
Dim itemsGroupedByID = lstOutcomes.GroupBy(Function(x) x.ID)
Dim duplicateItems = itemsGroupedByID.Where(Function(x) x.Count > 1) _
                                     .SelectMany(Function(x) x) _
                                     .ToList()
If duplicateItems.Count > 0 Then
    valid = False
    Dim errorMessage = "The following items have a duplicate ID: " & _
        String.Join(", ", duplicateItems.Select(Function(x) x.Name))
End If
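For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same GroupBy idea. The `Outcome` class, the module name and the sample data are assumptions made up for illustration, not the asker's real types; it also reports each duplicated ID once instead of flattening the groups.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the asker's object type.
Public Class Outcome
    Public Property ID As Integer
    Public Property Name As String
End Class

Module DuplicateCheckDemo
    Sub Main()
        ' Made-up sample data: ID 1 appears twice.
        Dim lstOutcomes As New List(Of Outcome) From {
            New Outcome With {.ID = 1, .Name = "Alpha"},
            New Outcome With {.ID = 2, .Name = "Beta"},
            New Outcome With {.ID = 1, .Name = "Gamma"}
        }

        ' Group by ID and keep only the groups with more than one member.
        Dim duplicateGroups = lstOutcomes.GroupBy(Function(o) o.ID) _
                                         .Where(Function(g) g.Count() > 1) _
                                         .ToList()

        Dim valid As Boolean = (duplicateGroups.Count = 0)

        If Not valid Then
            ' One message line per duplicated ID, listing the affected names.
            For Each g In duplicateGroups
                Dim names = String.Join(", ", g.Select(Function(o) o.Name))
                Console.WriteLine("Duplicate ID " & g.Key & ": " & names)
            Next
        End If
    End Sub
End Module
```

With the sample data above this should print a single line for ID 1 naming Alpha and Gamma. Keeping the groups rather than calling SelectMany makes it easy to build one message per ID.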
I'll take what Saeed Amiri wrote in C# and complete it in VB.
' Sort by ID so that equal IDs end up next to each other.
lst.Sort(Function(a, b) a.ID.CompareTo(b.ID))
Dim valid As Boolean = True
' After Skip(1), lst(index) is the element just before x.
Dim duplicatedItems = lst.Skip(1) _
    .Where(Function(x, index) x.ID = lst(index).ID)
Dim count As Integer = duplicatedItems.Count()
For Each item As objectType In duplicatedItems
    valid = False
    Console.WriteLine("ID: " & item.ID & ", Name: " & item.Name)
Next
The project is behind schedule, so I just hacked it together like this:
' For each outcome, if it is in the list of valid outcomes more than once, and it is not in the list of
' duplicates, add it to the duplicates list.
Dim lstDuplicates As New List(Of objectType)
For Each outcome As objectType In lstOutcomes
    'declare a stable outcome variable
    Dim loutcome As objectType = outcome
    If lstOutcomes.Where(Function(o) o.ID = loutcome.ID).Count > 1 _
        AndAlso Not lstDuplicates.Where(Function(d) d.ID = loutcome.ID).Count > 0 Then
        lstDuplicates.Add(outcome)
    End If
Next
If lstDuplicates.Count > 0 Then
    valid = False
    sbErrors.Append("There cannot be multiple outcomes of any kind. The following " & lstDuplicates.Count & _
        " outcomes are duplicates: ")
    For Each dup As objectType In lstDuplicates
        sbErrors.Append("""" & dup.Name & """" & " ")
    Next
    sbErrors.Append("." & vbNewLine)
End If
This is late, but I thought it could help others.
You can achieve this with a pair of very clean one-liners:
Dim lstOutcomes As IList(Of T)
Dim FoundDuplicates As Boolean
FoundDuplicates = lstOutcomes.Any(Function(p) lstOutcomes.ToArray.Count(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) > 1)
' Where returns an IEnumerable, so materialize it with ToList.
Dim ListOfDuplicates As List(Of T)
ListOfDuplicates = lstOutcomes.Where(Function(p) lstOutcomes.ToArray.Count(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) > 1).ToList()
Then you can clean the list of duplicates so that it contains each duplicate only once:
' The list must be instantiated before items can be added to it.
Dim CleanList As New List(Of T)
For Each MyDuplicate As T In ListOfDuplicates
    If Not CleanList.Any(Function(p) p.ID = MyDuplicate.ID AndAlso p.Name = MyDuplicate.Name) Then
        CleanList.Add(MyDuplicate)
    End If
Next
Or as a one-liner, although it does not read as nicely (ForEach is a method of List(Of T), hence the ToList call):
ListOfDuplicates.ToList().ForEach(Sub(p) If Not CleanList.Any(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) Then CleanList.Add(p))
Finally, in anticipation of future requirements, you should define "what a duplicate is" in one separate place. A delegate is quite convenient for this:
Dim AreDuplicates As Func(Of T, T, Boolean) = Function(a, b) a.ID = b.ID AndAlso a.Name = b.Name
FoundDuplicates = lstOutcomes.Any(Function(p) lstOutcomes.ToArray.Count(Function(q) AreDuplicates(p, q)) > 1)
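To make the delegate idea concrete, here is a small runnable sketch. The `Outcome` class, the module name and the sample data are hypothetical, and the rule compares only the ID (as in the original question) to show that the definition of a duplicate can be swapped in a single line.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the list's element type.
Public Class Outcome
    Public Property ID As Integer
    Public Property Name As String
End Class

Module DelegateDemo
    Sub Main()
        ' Made-up sample data: ID 1 appears twice, with different names.
        Dim lstOutcomes As New List(Of Outcome) From {
            New Outcome With {.ID = 1, .Name = "Alpha"},
            New Outcome With {.ID = 2, .Name = "Beta"},
            New Outcome With {.ID = 1, .Name = "Gamma"}
        }

        ' The duplicate rule lives in one place; change it here only.
        Dim AreDuplicates As Func(Of Outcome, Outcome, Boolean) =
            Function(a, b) a.ID = b.ID

        ' Every item matches itself once, so more than one match means a duplicate.
        Dim FoundDuplicates As Boolean =
            lstOutcomes.Any(Function(p) lstOutcomes.Where(Function(q) AreDuplicates(p, q)).Count() > 1)

        Console.WriteLine(FoundDuplicates)
    End Sub
End Module
```

Note that this compares every item with every other item, so it is O(n²); for large lists the grouping or sorting approaches from the other answers scale better.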