Efficient variable byte iteration over a string in Python
I'm reading a large (500MB) binary file in Python and parsing it byte by byte into a Python data structure. The file represents a sparse data grid. Depending on the format, I sometimes need to read one byte, two bytes, or four bytes at a time. For bureaucratic reasons, I'm required to do this in Python rather than C.
I'm looking for runtime efficient mechanisms to do this in Python. Below is a simplified example of what I'm doing now:
import struct

with open(filename, 'rb') as inFile:
    nCoords = struct.unpack('!i', inFile.read(4))[0]
    for i in range(nCoords):
        coord = (struct.unpack_from('!h', inFile.read(2))[0],
                 struct.unpack_from('!h', inFile.read(2))[0])  # x, y coord
        nCrops = struct.unpack_from('!B', inFile.read(1))[0]  # n crops
        for j in range(nCrops):
            cropId = struct.unpack_from('!B', inFile.read(1))[0]  # crop id
I'm wondering if loading the whole file from disk into memory as a single bytes object and parsing that would be more efficient than reading a few bytes at a time. Something like:
with open(filename, 'rb') as inFile:
    wholeFile = inFile.read()
But I doubt that slicing wholeFile will be more efficient than what I'm already doing.
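For reference, here is a minimal sketch of what that whole-file approach could look like, using struct.unpack_from with a running offset instead of a file.read() call per field. It assumes the same hypothetical field layout as the snippet above; parse_grid and the sample buffer are illustrative names, not part of the original code.

```python
import struct

def parse_grid(data):
    # Walk an in-memory bytes object with an explicit offset,
    # avoiding one file.read() call per field.
    offset = 0
    n_coords = struct.unpack_from('!i', data, offset)[0]
    offset += 4
    result = []
    for _ in range(n_coords):
        # Both 2-byte coords in a single unpack call.
        x, y = struct.unpack_from('!hh', data, offset)
        offset += 4
        n_crops = struct.unpack_from('!B', data, offset)[0]
        offset += 1
        # All crop ids for this coord in one call as well.
        crop_ids = list(struct.unpack_from('!%dB' % n_crops, data, offset))
        offset += n_crops
        result.append(((x, y), crop_ids))
    return result

# Example buffer: one coordinate (3, 4) with two crop ids, 7 and 9.
sample = struct.pack('!ihhBBB', 1, 3, 4, 2, 7, 9)
print(parse_grid(sample))  # [((3, 4), [7, 9])]
```

Batching fields into one format string ('!hh', '!2B') also cuts the number of unpack calls, which is usually where the Python-level overhead goes.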
Is there a runtime-efficient mechanism in Python to read a file into a string and then iterate over it a few bytes at a time? (I've checked out StringIO, but it only seems to let me read a line at a time, which doesn't help here since the whole file is effectively one line.)
Would using mmap be a better approach?
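A hedged sketch of the mmap route, since struct.unpack_from accepts any buffer-like object, including a memory map. The file layout here is the same hypothetical one as above, written to a temp file purely for demonstration:

```python
import mmap
import os
import struct
import tempfile

# Write a small sample file: one coordinate (3, 4) with crop ids 7 and 9.
fd, path = tempfile.mkstemp()
os.write(fd, struct.pack('!ihhBBB', 1, 3, 4, 2, 7, 9))
os.close(fd)

with open(path, 'rb') as f:
    # Map the whole file read-only; the OS pages data in lazily, so even
    # a 500MB file is not copied into a Python object up front.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n_coords = struct.unpack_from('!i', mm, 0)[0]
        x, y = struct.unpack_from('!hh', mm, 4)
        n_crops = struct.unpack_from('!B', mm, 8)[0]
        crop_ids = list(struct.unpack_from('!%dB' % n_crops, mm, 9))

print(n_coords, (x, y), crop_ids)  # 1 (3, 4) [7, 9]
os.remove(path)
```

The parsing code is identical to the bytes version; only the buffer source changes, which makes it easy to benchmark mmap against a plain read() of the whole file.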