Working with bytes and binary data in Python
Four consecutive bytes in a byte string together specify some value. However, only 7 bits in each byte are used; the most significant bit is always zero and therefore its ignored (that makes 28 bits altogether). So...
b"\x00\x00\x02\x01"
would be 000 0000
000 0000
000 0010
000 0001
.
Or, for the sake of legibility, 10 000 0001
. That's the value the four bytes represent. But I want a decimal, so I do this:
>>> 0b开发者_Go百科100000001
257
I can work all that out myself, but how would I incorporate it into a program?
Use bitshifting and addition:
bytes = b"\x00\x00\x02\x01"
i = 0
for b in bytes:
i <<= 7
i += b # Or use (b & 0x7f) if the last bit might not be zero.
print(i)
Result:
257
Using the bitarray module, you can do it a lot quicker for big numbers:
Benchmarks (factor 2.4x speedup!):
janus@Zeus /tmp % python3 -m timeit -s "import tst" "tst.tst(10000)"
10 loops, best of 3: 251 msec per loop
janus@Zeus /tmp % python3 -m timeit -s "import tst" "tst.tst(100)"
1000 loops, best of 3: 700 usec per loop
janus@Zeus /tmp % python3 -m timeit -s "import sevenbittoint, os" "sevenbittoint.sevenbittoint(os.urandom(10000))"
10 loops, best of 3: 73.7 msec per loop
janus@Zeus /tmp % python3 -m timeit -s "import quick, os" "quick.quick(os.urandom(10000))"
10 loops, best of 3: 179 msec per loop
quick.py (from Mark Byers):
def quick(bites):
i = 0
for b in bites:
i <<= 7
i += (b & 0x7f)
#i += b
return i
sevenbittoint.py:
import bitarray
import functools
def inttobitarray(x):
a = bitarray.bitarray()
a.frombytes(x.to_bytes(1,'big'))
return a
def concatter(accumulator,thisitem):
thisitem.pop(0)
for i in thisitem.tolist():
accumulator.append(i)
return accumulator
def sevenbittoint(bajts):
concatted = functools.reduce(concatter, map(inttobitarray, bajts), bitarray.bitarray())
missingbits = 8 - len(concatted) % 8
for i in range(missingbits): concatted.insert(0,0) # zeropad
return int.from_bytes(concatted.tobytes(), byteorder='big')
def tst():
num = 32768
print(bin(num))
print(sevenbittoint(num.to_bytes(2,'big')))
if __name__ == "__main__":
tst()
tst.py:
import os
import quick
import sevenbittoint
def tst(sz):
bajts = os.urandom(sz)
#for i in range(pow(2,16)):
# if i % pow(2,12) == 0: print(i)
# bajts = i.to_bytes(2, 'big')
a = quick.quick(bajts)
b = sevenbittoint.sevenbittoint(bajts)
if a != b: raise Exception((i, bin(int.from_bytes(bajts,'big')), a, b))
精彩评论