Python method for storing list of bytes in network (big-endian) byte order to file (little-endian)

My current task is to dissect tcpdump data that includes P2P messages, and I am having trouble with the piece data I capture and write to a file on my x86 machine. I suspect I have a simple endianness issue with the bytes I write to the file.

I have a list of bytes holding a piece of P2P video, read and processed using the python-pcapy package.

bytes = [14, 254, 23, 35, 34, 67, etc... ]

I am looking for a way to store these bytes, currently held in a list in my Python application, to a file.

Currently I write the pieces as follows:

def writePiece(self, filename, pieceindex, bytes, ipsrc, ipdst, ts): 
    file = open(filename,"ab")
    # Iterate through bytes writing them to a file if don't have piece already 
    if not self.piecemap[ipdst].has_key(pieceindex):
        for byte in bytes: 
            file.write('%c' % byte)
        file.flush()
        self.procLog.info("Wrote (%d) bytes of piece (%d) to %s" % (len(bytes), pieceindex, filename))

    # Remember we have this piece now in case duplicates arrive 
    self.piecemap[ipdst][pieceindex] = True

    # TODO: Collect stats 
    file.close()

As you can see from the for loop, I write the bytes to the file in the same order as I process them from the wire (i.e. network or big-endian order).

Suffice it to say, the video that is the payload of the pieces does not play back well in VLC :-D

I think I need to convert them to little-endian byte order but am not sure of the best way to approach this in Python.

UPDATE

The solution that worked out for me (writing P2P pieces handling endian issues appropriately) was:

def writePiece(self, filename, pieceindex, bytes, ipsrc, ipdst, ts): 
    file = open(filename,"r+b")
    if not self.piecemap[ipdst].has_key(pieceindex):
        little = struct.pack('<'+'B'*len(bytes), *bytes) 
        # Seek to offset based on piece index 
        file.seek(pieceindex * self.piecesize)
        file.write(little)
        file.flush()
        self.procLog.info("Wrote (%d) bytes of piece (%d) to %s" % (len(bytes), pieceindex, filename))

    # Remember we have this piece now in case duplicates arrive 
    self.piecemap[ipdst][pieceindex] = True

    file.close()

The key to the solution was using Python's struct module, as suspected, in particular:

    little = struct.pack('<'+'B'*len(bytes), *bytes) 

Thanks to those who responded with helpful suggestions.
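
Compared with the first version, the updated writePiece also opens the file in "r+b" mode and seeks to pieceindex * self.piecesize, so each piece lands at its own offset instead of being appended in arrival order. Here is a minimal standalone sketch of that offset-based write. The helper name write_piece_at, the piece size of 4 and the temporary file are hypothetical and chosen only for illustration; the sketch also creates the file if it does not exist yet, which the code above leaves to the caller.

```python
import os
import tempfile

def write_piece_at(path, piece_index, piece_size, payload):
    """Write payload (a list of 0-255 ints) at offset piece_index * piece_size."""
    # "r+b" needs an existing file, so fall back to "w+b" on first use.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as out:
        out.seek(piece_index * piece_size)
        out.write(bytearray(payload))

# Hypothetical demo: two 4-byte pieces arriving out of order.
path = os.path.join(tempfile.mkdtemp(), "pieces.bin")
write_piece_at(path, 1, 4, [5, 6, 7, 8])  # piece 1 arrives first
write_piece_at(path, 0, 4, [1, 2, 3, 4])  # piece 0 arrives later

with open(path, "rb") as check:
    result = check.read()
```

After both calls the file holds the pieces in index order, regardless of the order in which they were written.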


To save yourself some work you might like to use a bytearray (Python 2.6 and later):

b = [14, 254, 23, 35]
# bytearray turns the 0-255 integers into raw bytes in one step
with open("file", 'ab') as f:
    f.write(bytearray(b))

This converts your 0-255 values into bytes without any explicit looping.

I can't see what your problem is otherwise without more information. If the data really is byte-wise then endianness isn't an issue, as others have said.
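
To illustrate the point about byte-wise data, here is a small sketch (the sample values are arbitrary) showing that the byte-order prefix in a struct format has no effect on single-byte 'B' fields, and only changes the output for multi-byte fields such as 'H':

```python
import struct

values = [14, 254, 23, 35]

# Single-byte fields: '<' and '>' give identical output.
little = struct.pack('<' + 'B' * len(values), *values)
big = struct.pack('>' + 'B' * len(values), *values)
plain = bytes(bytearray(values))

# A 16-bit field: here the byte order does change the result.
as_big_u16 = struct.pack('>H', 0x0EFE)
as_little_u16 = struct.pack('<H', 0x0EFE)

print(little == big == plain)
print(as_big_u16 != as_little_u16)
```

Both comparisons print True: the list of bytes is written in the same order either way, while the 16-bit value has its two bytes swapped.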

(By the way, using bytes and file as variable names isn't a good idea, as it hides the built-ins of the same name.)


You can also use an array.array:

from array import array
f.write(array('B', bytes))

instead of

f.write(struct.pack('<'+'B'*len(bytes), *bytes))

which when tidied up a little is

f.write(struct.pack('B' * len(bytes), *bytes))
# the < is redundant; there is NO ENDIANNESS ISSUE

which if len(bytes) is "large" might be better as

f.write(struct.pack('%dB' % len(bytes), *bytes)) 
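
As a quick check, assuming Python 3 (for array.tobytes) and an arbitrary sample list named data, the variants above all produce exactly the same bytes:

```python
import struct
from array import array

data = [14, 254, 23, 35, 34, 67]

via_array = array('B', data).tobytes()
via_repeat = struct.pack('B' * len(data), *data)      # 'BBBBBB'
via_count = struct.pack('%dB' % len(data), *data)     # '6B'
via_bytearray = bytes(bytearray(data))

print(via_array == via_repeat == via_count == via_bytearray)
```

The '%dB' form only shortens the format string; the packed result is unchanged.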


This may have been answered previously in Python File Slurp w/ endian conversion.
