开发者

VB.net Memory efficient function needed

I'm getting out of memory exceptions from the following function when RowCollection is 50000+ and thus i need to make it more memory efficient. The function is simply needs to construct a comma separated string of the row indexes stored in RowCollection. Can anyone spot any obvious memory hungry operations in the following?

N.B RowCollection just contains a list of row indexes stored as integers.

 Private Function GetCommaSeparatedString(ByRef RowIndexes As ArrayList) As String
        Dim RowString As String = String.Empty

        'Build a string of the row indexes 
        'Add one onto each index value so our indexes begin at 1 
        For Each Row In RowIndexes 
            RowString += CInt(Row.ToString) + 1 & ","
        Next

   开发者_开发知识库     'Remove the last comma
        If RowString.Length > 0 Then
            RowString = RowString.Substring(0, RowString.Length - 1)
        End If

        Return RowString
    End Function

Thanks in advance.


I'm not sure why you're getting out of memory errors, unless the string representation of your rows is extremely large, because you never have more than one or two non-garbage-collectible strings.

However, your method is horribly inefficient because it spends so much time copying the contents of half-built strings. A StringBuilder is more appropriate when building large strings, because it can be modified without re-creating the contents each time.

HOWEVER, in this case even a StringBuilder is a bad idea, because you are joining strings and there is already a method to do that: String.Join. Just use a LINQ query to do the add-one-to-index-stuff and you get a one-liner:

Private Function GetCommaSeparatedString(ByVal RowIndexes As ArrayList) As String
    Return String.Join(",", From index In RowIndexes Select CInt(index) + 1)
End Function

I would also recommend not passing by reference unless you actually need it. You aren't modifying RowIndexes, so pass it by value. I'm also not sure why you are ToString()-ing the index then immediately parsing it. Aren't they already integers? Just use CInt.


Update: while this is a straight change to use stringbuilder, look at the better approaches by Strilanc or Steven Sudit

Well, you may still run out of memory (memory is finite, after all), but you should be using a StringBuilder, not concatenating strings. Each time, you are creating a new string object rather than changing it (As strings are immutable)

Private Function GetCommaSeparatedString(ByRef RowIndexes As ArrayList) As String
    Dim RowString As New StringBuilder()

    'Build a string of the row indexes 
    'Add one onto each index value so our indexes begin at 1 
    For Each Row In RowIndexes 
        RowString.AppendFormat("{0},",  CInt(Row.ToString) + 1)
    Next

    'Remove the last comma
    If RowString.Length > 0 Then
        RowString.Append(RowString.Substring(0, RowString.Length - 1))
    End If

    Return RowString
End Function


StringBuilder is a good idea, but why not just avoid the problem by streaming the output out instead of trying to hold it all in memory at once?


This is because in each iteration, behind the scenes you are creating 2 strings, and they are getting big nearer the end.

"1,2,3,4,5,....499,500" "1,2,3,4,5,....499,500,"

at the end of only 500 iterations, you are creating 2 strings nearly 2000 characters long, only to have them discarded in the next iteration (but the runtime may be keeping them around).

In the last iteration, your string (from 1 to 50000) would be 100,000 characters long, assuming your row indexes are even sequential. This would mean you have allocated ~ 10,000,000,000 characters or (I believe 2 bytes/char) 20 gigabytes of strings.

You can start by using StringBuilder instead of += on a string (RowString).
Ex

Dim RowString As StringBuilder = new StringBuilder( 100000 )

For Each Row In RowIndexes 
    RowString.Append( CInt(Row.ToString) + 1).Append( "," )
Next

'...'

Return RowString.ToString

You can try the next one as well, but you should profile the two and pick the best for you.

Private Function GetCommaSeperatedString(ByRef RowIndexes As ArrayList) As String
    Dim indexArray as String[] = RowIndexes
                                 .Select(Function(r)=> (CInt(r.ToString) + 1)ToString)
                                 .ToArray
    return String.Join( ',', indexArray)
End Function



* note: these are the first lines of VB I've ever written, so I may have made a basic mistake (especially in the linq/lambda stuff), but the point is there.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜