group duplicates with counts
I need to go through a list of strings and count the number of duplicates and then print the string with the number of occurrences in one line to a file. Here is what I have but I need to only print the string once and its count.
Do
line = LineInput(1)
Trim(line)
If line = temp Then
counter += 1
Else
counter = 1
End If
temp = line
swriter.WriteLine(line & " " & counter.ToString)
swriter.Flush()
Loop While Not EOF(1)
My brain is just not functioning today..
You could also use LINQ:
Dim dups = From x In IO.File.ReadAllLines("TextFile1.txt") _
Group By x Into Group _
Where Group.Count > 1 _
Let count = Group.Count() _
Order By count Descending _
Select New With { _
Key .Value = x, _
Key .Count = count _
}
For Each d In dups
swriter.WriteLine(String.Format("duplicate: {0} count: {1}", d.Value, d.Count))
swriter.Flush()
Next
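If every string should be listed once with its count, not only the ones that repeat, the same grouping idea works without the Where filter. This is only a sketch: CountAll, trimmed, Value and Total are names made up for the example, and it takes the lines as a parameter instead of reading "TextFile1.txt" so it can be run on its own.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LinqCount
    ' Returns one "text count" entry per distinct trimmed line,
    ' in order of first appearance. CountAll is a hypothetical helper name.
    Function CountAll(lines As IEnumerable(Of String)) As List(Of String)
        Dim counts = From x In lines _
                     Group By trimmed = x.Trim() Into Group _
                     Select Value = trimmed, Total = Group.Count()

        Dim result As New List(Of String)
        For Each c In counts
            result.Add(c.Value & " " & c.Total.ToString())
        Next
        Return result
    End Function

    Sub Main()
        ' Sample data stands in for IO.File.ReadAllLines(...).
        For Each entry As String In CountAll(New String() {"a", "b", "a"})
            Console.WriteLine(entry)
        Next
    End Sub
End Module
```

With the sample data this prints "a 2" and then "b 1". Replacing the array with IO.File.ReadAllLines on the real file and writing each entry through swriter should give the file output the question asks for.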
You should probably use something like a Dictionary to count the strings.
Dim dict As New Dictionary(Of String, Integer)
' Check for end of file before reading, so an empty file does not make LineInput throw.
Do While Not EOF(1)
line = Trim(LineInput(1))
If dict.ContainsKey(line) Then
dict(line) += 1
Else
dict.Add(line, 1)
End If
Loop
And then print out the elements in the dictionary:
For Each line As String In dict.Keys
swriter.WriteLine(line & " " & dict(line))
swriter.Flush()
Next
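If the input is already sorted, so that equal strings sit next to each other (which the loop in the question seems to assume), the original approach can be kept with one change: write a string only when the next line differs, and write the last run after the loop ends. A minimal sketch of that idea follows; CountRuns is a hypothetical helper that works on a list of lines in memory, not the asker's file handles.

```vbnet
Imports System
Imports System.Collections.Generic

Module RunLengthCount
    ' Collapses runs of identical adjacent lines into "text count" strings.
    ' Assumes equal lines are already adjacent (for example, the input is sorted).
    Function CountRuns(lines As IEnumerable(Of String)) As List(Of String)
        Dim result As New List(Of String)
        Dim previous As String = Nothing
        Dim counter As Integer = 0

        For Each raw As String In lines
            Dim line As String = raw.Trim()
            If counter > 0 AndAlso line = previous Then
                counter += 1
            Else
                ' The run has ended, so emit it once before starting the next run.
                If counter > 0 Then result.Add(previous & " " & counter.ToString())
                previous = line
                counter = 1
            End If
        Next

        ' Emit the final run, which the loop itself never closes.
        If counter > 0 Then result.Add(previous & " " & counter.ToString())
        Return result
    End Function

    Sub Main()
        For Each entry As String In CountRuns(New String() {"a", "a", "b", "c", "c", "c"})
            Console.WriteLine(entry)
        Next
    End Sub
End Module
```

For the sample data this prints "a 2", "b 1" and "c 3". Unlike the Dictionary version it only needs to remember the previous line, but on unsorted input the same string would show up once per run.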