How to split a list into parts that are each smaller than 1 MByte
I have a sorted list of dictionaries returned by a remote API call (typically the response is less than 4 MByte).
I would like to split this list into chunks where the maximum allowed size of each resulting chunk is 1 MByte.* The resulting list of chunks needs to preserve the initial sorting; the chunks will then be serialized (via Pickle) and put into separate Blob fields, each with a 1 MByte maximum size.
What's the fastest code to achieve that with Python 2.5?
*The number of chunks should be the lowest that satisfies the 1 MByte constraint.
Following up on my comment: you could use this extension together with the following script. Note that this won't optimize the size of the chunks; it only ensures that none of them is larger than MAX.
from asizeof import asizeof   # the standalone asizeof module (also shipped as pympler.asizeof)

MAX = 1024 * 1024             # 1 MByte limit per chunk

matrix = []                   # list of chunks
new_chunk = []                # chunk currently being filled
size_of_current_chunk = 0     # estimated size of the current chunk in bytes

for x in your_sorted_list:
    s = asizeof(x)
    if size_of_current_chunk + s > MAX:
        # adding x would push the chunk over MAX: close it and start a new one
        matrix.append(new_chunk)
        size_of_current_chunk = 0
        new_chunk = []
    size_of_current_chunk += s
    new_chunk.append(x)

if new_chunk:
    matrix.append(new_chunk)
The matrix variable would then contain lists of objects, with less than MAX bytes in each of them.
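For the serialization step mentioned in the question, here is a minimal sketch using the standard pickle module and the matrix produced above; since asizeof measures in-memory size rather than pickled size, it is worth checking the pickled blobs against the limit as well:

import pickle

# One pickled byte string per chunk, ready to be stored in separate Blob fields.
blobs = [pickle.dumps(chunk, pickle.HIGHEST_PROTOCOL) for chunk in matrix]

# asizeof measures in-memory size, which can differ from the pickled size,
# so verify each blob actually fits the 1 MByte limit.
assert all(len(blob) <= MAX for blob in blobs)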
It'd be interesting to measure the performance of asizeof against simply encoding each object as a JSON string and using the string's length (times the size of a char) as the size estimate.
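A rough benchmark sketch for that comparison, assuming a hypothetical sample list of dictionaries standing in for the API response:

import json
import timeit
from pympler.asizeof import asizeof

sample = [{"id": i, "name": "item-%d" % i, "tags": ["a", "b", "c"]}
          for i in range(1000)]

def measure_with_asizeof():
    return sum(asizeof(x) for x in sample)

def measure_with_json():
    # Approximate each item's size by the length of its JSON encoding.
    return sum(len(json.dumps(x)) for x in sample)

print("asizeof:", timeit.timeit(measure_with_asizeof, number=10))
print("json:   ", timeit.timeit(measure_with_json, number=10))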
I found the Pympler library; its asizeof module provides basic size information for one or several Python objects (tested with Python 2.2.3, 2.3.7, 2.4.5, 2.5.1, 2.5.2 and 2.6).
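A small usage sketch (the record dictionary is just an illustrative example):

from pympler.asizeof import asizeof

record = {"id": 1, "payload": "x" * 1000}
print(asizeof(record))          # deep size of the dict, in bytes
print(asizeof(*[record] * 3))   # combined size of several objects at once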