开发者

Create CSV file faster

I create a CSV file by getting a datatable then looping through that datatable and writing eachline of a CSV file. My data source usually has around 65,000 rows in it. This process takes several minutes to download from the browser. The problem is that locally and on dev it doesn't take too long, but on the client they usually timeout.

Is there a faster way to generate this file?

Function GenerateCSVFile() As String
    Dim stuPro As New studentProvider.StudentProvider
    Dim emailCenter As New EmailCenter
    Dim strFileName As String = System.IO.Path.GetRandomFileName().Replace(".", "")
    Dim strResult As String = ""

    Dim dtStudent As Data.DataTable
    Dim paymentYear As String = ""
    dtStudent = stuPro.generateDataFile()

    If dtStudent.Rows.Count > 0 Then

        Using sw As New System.IO.StreamWriter(Server.MapPath("Temp/" + strFileName + ".csv"))
            Try
                Dim lineValue As String = ""

                lineValue += "Academic Year, StudentID, SSN, First, Middle, Last"

                sw.WriteLine(lineValue)

                For i As Integer = 0 To dtStudent.Rows.Count - 1

                    lineValue = dtStudent.Rows(i)("fy").ToString
                    lineValue += "," & dtStudent.Rows(i)("uniq_stu_id").ToString
                    lineValue += "," & dtStudent.Rows(i)("ssn").ToString
                    lineValue += "," & dtStudent.Rows(i)("fname").ToString
                    lineValue += "," & dtStudent.Rows(i)("mname").ToString
                    lineValue += "," & dtStudent.Rows(i)("lname").ToString
                    sw.WriteLine(lineValue)

                Next
            Catch ex As Exception
                strResult += ex.ToString
            Finally
                sw.Close()
            End Try

        End Using

        Dim strFriendlyName As String = Date.Now.ToString("MM-dd-yyyy") & ".csv"

        If String.IsNullOrEmpty(strResult) Then

            Dim fs As System.IO.FileStream = Nothing

            fs = System.IO.File.Open(Server.MapPath("Temp/" + strFileName + ".csv"), System.IO.FileMode.Open)
            Dim btFile(fs.Length) As Byte
            fs.Read(btFile, 0, fs.Length)
            fs.Close()

            With Response
                .AddHeader("Content-disposition", "attachment;filename=" & strFriendlyName)
                .ContentType = "application/octet-stream"
                .BinaryWrite(btFile)
                .End()
            End With
        End If
    Else
        strResult = "No records found for specified academic year"
    End If

    Return strResult
End Function

Updated Code

Function GenerateCSVFile() As String
    Dim startDate As Date = Date.Now
    Dim enddate As Date = Nothing
    Dim stuPro As New studentProvider.StudentProvider
    Dim emailCenter As New EmailCenter
    Dim strFileName As String = System.IO.Path.GetRandomFileName().Replace(".", "")
    Dim strResult As String = ""

    Dim dtStudent As Data.DataTable
    Dim paymentYear As String = ""
    dtStudent = stuPro.generateDataFile(Session("VendorID"), txtAcademicYear.Text.Trim)

    If dtStudent.Rows.Count > 0 Then

        With Response

            Dim strFriendlyName As String = Date.Now.ToString("MM-dd-yyyy") & ".csv"
            .AddHeader("Content-disposition", "attachment;filename=" & strFriendlyName)
            .ContentType = "application/octet-stream"

            Dim lineValue As StringBuilder = New StringBuilder

            lineValue.Append("Academic Year, StudentID, SSN, First, Middle, Last")

            .Write(lineValue.ToString)

            For i As Integer = 0 To dtStudent.Rows.Count - 1

                lineValue = New StringBuilder
                lineValue.Append(dtStudent.Rows(i)("fy").ToString)
                lineValue.Append("," & dtStudent.Rows(i)("uniq_stu_id").ToString)
                lineValue.Append("," & dtStudent.Rows(i)("ssn").ToString)
                lineValue.Append("," & dtStudent.开发者_开发问答Rows(i)("fname").ToString)
                lineValue.Append("," & dtStudent.Rows(i)("mname").ToString)
                lineValue.Append("," & dtStudent.Rows(i)("lname").ToString)

                .Write(lineValue.ToString)

            Next
            enddate = Date.Now

            MsgBox(DateDiff(DateInterval.Second, startDate, enddate))

            .End()
        End With
    Else
        strResult = "No records found for specified academic year"
    End If
    Return strResult
End Function


You are writing to a temp file, reading that file in, and they writing the contents of that file into the response. Skip the temp file and write directly to the response one line at a time. That will keep the browser from thinking your server is timing out, make things faster, and reduce the amount of memory consumed by your app.

After that, look into how to enable caching on this web request so that ASP.NET won't have to recreate the CSV if multiple users ask for it in a short period of time.


There are some options to speed this up:

  • Don't write from StreamWriter to file to Response, write to the pages Response directly.
  • Look for a solution using a datareader instead of datatable to loop through the data, with this much data it might be faster. It also will lower memory usage (does not load whole table in memory).
  • If concatenating many strings use a StringBuilder, or in this case use String.Join to easily create the whole line.

String.Join example:

For Each row in dtStudent.Rows
    Dim line as new List(of String)
    line.Add(row("fy").ToString)
    line.Add(row("uniq_stu_id").ToString)
    line.Add(-etc-)

    Response.Write(String.Join(",", line.ToArray) & vbcrlf )
Next


You should be using StringBuilder instead of concatenation.


In addition to @Robert Levy's suggestions, be careful how you use string variables. You would be better served using a stringbuilder on these lines:

        dim sbTemp as new StringBuilder()
        For i As Integer = 0 To dtStudent.Rows.Count - 1

            sbTemp.Append(dtStudent.Rows(i)("fy").ToString)
            sbTemp.Append(",") 
            sbTemp.Append(dtStudent.Rows(i)("uniq_stu_id").ToString)

            'etc
            sw.WriteLine(lineValue)

        Next


One thing that you could look into is a producer/consumer design pattern. what this could allow you to do is have one (or more) thread(s) feeding a queue containing the data that needs to be written into the csv file and another thread (or more than one) that does the actual writing.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜