How to read a file byte by byte in Python and how to print a bytelist as a binary?
I'm trying to read a file byte by byte, but I'm not sure how to do that. I'm trying to do it like that:
file = open(filename, 'rb')
while 1:
byte = file.read(8)
# Do something...
So does that make the variable byte to contain 8 next bits at the beginning of every loop? It doesn't matter what those bytes really are. The only thing that matters is that I need to read a file in 8-bit stacks.
EDIT:
Also I collect t开发者_高级运维hose bytes in a list and I would like to print them so that they don't print out as ASCII characters, but as raw bytes i.e. when I print that bytelist it gives the result as
['10010101', '00011100', .... ]
To read one byte:
file.read(1)
8 bits is one byte.
To answer the second part of your question, to convert to binary you can use a format string and the ord
function:
>>> byte = 'a'
>>> '{0:08b}'.format(ord(byte))
'01100001'
Note that the format pads with the right number of leading zeros, which seems to be your requirement. This method needs Python 2.6 or later.
The code you've shown will read 8 bytes. You could use
with open(filename, 'rb') as f:
while 1:
byte_s = f.read(1)
if not byte_s:
break
byte = byte_s[0]
...
There's a python module especially made for reading and writing to and from binary encoded data called 'struct'. Since versions of Python under 2.6 doesn't support str.format, a custom method needs to be used to create binary formatted strings.
import struct
# binary string
def bstr(n): # n in range 0-255
return ''.join([str(n >> x & 1) for x in (7,6,5,4,3,2,1,0)])
# read file into an array of binary formatted strings.
def read_binary(path):
f = open(path,'rb')
binlist = []
while True:
bin = struct.unpack('B',f.read(1))[0] # B stands for unsigned char (8 bits)
if not bin:
break
strBin = bstr(bin)
binlist.append(strBin)
return binlist
Late to the party, but this may help anyone looking for a quick solution:
you can use bin(ord('b')).replace('b', '')
bin() it gives you the binary representation with a 'b' after the last bit, you have to remove it. Also ord() gives you the ASCII number to the char or 8-bit/1 Byte coded character.
Cheers
精彩评论