VB.net Memory efficient function needed
I'm getting out-of-memory exceptions from the following function when RowCollection has 50,000+ entries, so I need to make it more memory efficient. The function simply needs to construct a comma-separated string of the row indexes stored in RowCollection. Can anyone spot any obvious memory-hungry operations in the following?
N.B. RowCollection just contains a list of row indexes stored as integers.
Private Function GetCommaSeparatedString(ByRef RowIndexes As ArrayList) As String
Dim RowString As String = String.Empty
'Build a string of the row indexes
'Add one onto each index value so our indexes begin at 1
For Each Row In RowIndexes
RowString += CInt(Row.ToString) + 1 & ","
Next
'Remove the last comma
If RowString.Length > 0 Then
RowString = RowString.Substring(0, RowString.Length - 1)
End If
Return RowString
End Function
Thanks in advance.
I'm not sure why you're getting out of memory errors, unless the string representation of your rows is extremely large, because you never have more than one or two non-garbage-collectible strings.
However, your method is horribly inefficient because it spends so much time copying the contents of half-built strings. A StringBuilder is more appropriate when building large strings, because it can be modified without re-creating the contents each time.
HOWEVER, in this case even a StringBuilder is a bad idea, because you are joining strings and there is already a method to do that: String.Join. Just use a LINQ query to do the add-one-to-index-stuff and you get a one-liner:
Private Function GetCommaSeparatedString(ByVal RowIndexes As ArrayList) As String
Return String.Join(",", From index In RowIndexes Select CInt(index) + 1)
End Function
I would also recommend not passing by reference unless you actually need it. You aren't modifying RowIndexes, so pass it by value. I'm also not sure why you are ToString()-ing the index then immediately parsing it. Aren't they already integers? Just use CInt.
Update: while this is a straight change to use StringBuilder, look at the better approaches by Strilanc or Steven Sudit.
Well, you may still run out of memory (memory is finite, after all), but you should be using a StringBuilder, not concatenating strings. Because strings are immutable, each concatenation creates a new string object rather than changing the existing one.
Private Function GetCommaSeparatedString(ByRef RowIndexes As ArrayList) As String
Dim RowString As New StringBuilder()
'Build a string of the row indexes
'Add one onto each index value so our indexes begin at 1
For Each Row In RowIndexes
RowString.AppendFormat("{0},", CInt(Row.ToString) + 1)
Next
'Remove the last comma
If RowString.Length > 0 Then
'StringBuilder has no Substring method; shrinking Length drops the trailing comma in place
RowString.Length -= 1
End If
Return RowString.ToString()
End Function
StringBuilder is a good idea, but why not just avoid the problem by streaming the output out instead of trying to hold it all in memory at once?
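A minimal sketch of what streaming could look like, assuming whatever consumes the result can accept a TextWriter (a StreamWriter over a file, Console.Out, and so on) rather than a finished String. The helper name WriteCommaSeparated and the module are hypothetical, not from the original question.

```vbnet
Imports System
Imports System.Collections
Imports System.IO

Module StreamingExample
    ' Hypothetical helper: writes the 1-based indexes straight to a TextWriter,
    ' so the full comma-separated text is never held in memory as one string.
    Sub WriteCommaSeparated(ByVal RowIndexes As ArrayList, ByVal Output As TextWriter)
        Dim first As Boolean = True
        For Each Row As Object In RowIndexes
            ' Write the separator before every item except the first,
            ' so there is no trailing comma to remove afterwards.
            If Not first Then Output.Write(",")
            Output.Write(CInt(Row) + 1)
            first = False
        Next
    End Sub

    Sub Main()
        Dim rows As New ArrayList()
        rows.Add(0)
        rows.Add(1)
        rows.Add(2)
        WriteCommaSeparated(rows, Console.Out) ' prints 1,2,3
    End Sub
End Module
```

If the caller really does need a String at the end, this buys nothing over String.Join or a StringBuilder. It only helps when the output is headed to a file, a response stream, or similar.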
This is because in each iteration, behind the scenes you are creating 2 strings, and they are getting big nearer the end.
"1,2,3,4,5,....499,500" "1,2,3,4,5,....499,500,"
At the end of only 500 iterations, you are creating 2 strings nearly 2000 characters long, only to have them discarded in the next iteration (although the runtime may keep them around for a while before collecting them).
In the last iteration, your string (from 1 to 50000) would be close to 290,000 characters long, assuming your row indexes are sequential. Summed over all iterations, this would mean you have allocated on the order of 10,000,000,000 characters or (at 2 bytes/char) roughly 20 gigabytes of strings.
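A quick way to check these figures is to add up the lengths directly. The sketch below assumes sequential indexes 1 to 50000 and counts only one newly allocated string per iteration, so it gives a lower bound; the module name is made up for illustration.

```vbnet
Imports System

Module AllocationEstimate
    Sub Main()
        Dim currentLength As Long = 0   ' length of the string after each iteration
        Dim totalAllocated As Long = 0  ' characters allocated across all iterations

        For i As Integer = 1 To 50000
            ' Each iteration appends the digits of i plus one comma...
            currentLength += i.ToString().Length + 1
            ' ...and += allocates a brand-new string of the full current length.
            totalAllocated += currentLength
        Next

        Console.WriteLine("Final length: {0:N0} characters", currentLength)
        Console.WriteLine("Total allocated: {0:N0} characters", totalAllocated)
    End Sub
End Module
```

The final length works out to 288,894 characters, and the running total lands at roughly 7 billion characters (about 14 GB at 2 bytes per character). Counting two temporary strings per iteration doubles that. Either way, it is the cumulative churn that hurts, not the size of the final string.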
You can start by using StringBuilder instead of += on a string (RowString).
Example:
Dim RowString As StringBuilder = new StringBuilder( 100000 )
For Each Row In RowIndexes
RowString.Append( CInt(Row.ToString) + 1).Append( "," )
Next
'...'
Return RowString.ToString
You can try the next one as well, but you should profile the two and pick the best for you.
Private Function GetCommaSeparatedString(ByRef RowIndexes As ArrayList) As String
'ArrayList is non-generic, so Cast(Of Object) is needed before Select can be used
Dim indexArray As String() = RowIndexes.Cast(Of Object)() _
.Select(Function(r) (CInt(r) + 1).ToString()) _
.ToArray()
Return String.Join(",", indexArray)
End Function
* note: these are the first lines of VB I've ever written, so I may have made a basic mistake (especially in the linq/lambda stuff), but the point is there.