Python (or C) library for creating XLSX documents that can handle millions of rows [closed]
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
I'm looking for a library to create XLSX files that can contain upwards of a million rows and several dozen columns. So far, every Python library I have found consumes too much memory, and I haven't found a suitable C library to wrap. I'd prefer open source so I can modify the code if need be.
EDIT: I have found a solution. openpyxl has an "Optimized Writer": http://packages.python.org/openpyxl/optimized.html
Have you tried ElementTree? If that uses too much memory, use SAX and process just one row at a time. See also: XML parsing - ElementTree vs SAX and DOM
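A rough sketch of the row-at-a-time idea, using the standard library's SAX-style XMLGenerator to emit worksheet XML without building a tree in memory. The function name write_sheet_xml is made up for illustration, and every cell is written as an inline string to keep the example short. This produces only the worksheet part, not a complete XLSX file.

```python
import io
from xml.sax.saxutils import XMLGenerator

# SpreadsheetML main namespace used by worksheet parts.
SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def write_sheet_xml(rows, out):
    """Stream `rows` (any iterable of iterables) to `out` as worksheet XML.

    Hypothetical helper: only one row is held in memory at a time,
    so `rows` can be a generator yielding millions of rows.
    """
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("worksheet", {"xmlns": SHEET_NS})
    gen.startElement("sheetData", {})
    for row_number, row in enumerate(rows, start=1):
        gen.startElement("row", {"r": str(row_number)})
        for value in row:
            # Assumption: store every value as an inline string.
            gen.startElement("c", {"t": "inlineStr"})
            gen.startElement("is", {})
            gen.startElement("t", {})
            gen.characters(str(value))  # escapes &, < and > for us
            gen.endElement("t")
            gen.endElement("is")
            gen.endElement("c")
        gen.endElement("row")
    gen.endElement("sheetData")
    gen.endElement("worksheet")
    gen.endDocument()


if __name__ == "__main__":
    buf = io.BytesIO()
    write_sheet_xml(((i, f"name {i}") for i in range(3)), buf)
    print(buf.getvalue().decode("utf-8"))
```

In practice `out` would be a file opened in binary mode rather than a BytesIO, so the XML goes straight to disk as each row is produced.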
The XLSX format consists of a number of XML files that have been zipped. If the format of the output will not be changing, it would be trivial to use an existing file as a template and simply add rows to it as necessary. Unfortunately, ZipFile.writestr doesn't allow you to write the file in pieces, so you'll have to write the entire XML to a temporary file and then place that into the zip with ZipFile.write.
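Here is a minimal sketch of that template approach, under a few assumptions: the function name build_xlsx_from_template and the write_sheet callback are invented for illustration, and the sheet is assumed to live at xl/worksheets/sheet1.xml inside the template (check your own file). Every other member of the template is copied unchanged.

```python
import os
import tempfile
import zipfile

# Assumed location of the worksheet inside the template archive.
SHEET_MEMBER = "xl/worksheets/sheet1.xml"


def build_xlsx_from_template(template_path, output_path, write_sheet,
                             sheet_member=SHEET_MEMBER):
    """Copy a template XLSX, replacing one sheet with freshly written XML.

    `write_sheet` is a caller-supplied callable (hypothetical) that takes
    a binary file object and writes the complete worksheet XML to it,
    in as many pieces as it likes.
    """
    # Write the sheet XML to a temporary file first, piece by piece.
    fd, tmp_path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as tmp:
            write_sheet(tmp)
        with zipfile.ZipFile(template_path) as src, \
                zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every template member except the sheet being replaced.
            for info in src.infolist():
                if info.filename != sheet_member:
                    dst.writestr(info, src.read(info.filename))
            # ZipFile.write reads the temp file from disk, so the full
            # XML never has to be held in memory as one string.
            dst.write(tmp_path, arcname=sheet_member)
    finally:
        os.remove(tmp_path)
```

On Python 3.6 and later, ZipFile.open(name, mode="w") returns a writable file object, so the sheet XML could instead be streamed directly into the archive without the temporary file.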