decoding, little endian, uncompressed and float array
I have Base64-encoded data from an experiment. What I am trying to do, step by step, is:
- Retrieve bytes from base64 encoding (Decode it)
- Convert bytes to little-endian
- Decompress bytes from (zlib)
- Convert byte array to float array
Example:
Dn3LQ3np4kOyxQJE20kDRBRuFkScZB5ENxEzRFa+O0THMz9EOQRCRFC1QkRYeUNEwXJJROfbSUScvE5EVDtVRK5PV0TLUWNE481lRHX7ZkSBBWpE9FVyRIFdeESkoHhEnid8RI1nfUSy4YBE/C2CRGKQg0RcR4RE54uEROUAhUTBWodErKyMRNsVkkRvUpJEukWURO58lkSqRZ1E2VauRPBTwEQf9cVE9BnKRA==
What I have tried so far
import os
import base64
import struct
s = 'Dn3LQ3np4kOyxQJE20kDRBRuFkScZB5ENxEzRFa+O0THMz9EOQRCRFC1QkRYeUNEwXJJROfbSUScvE5EVDtVRK5PV0TLUWNE481lRHX7ZkSBBWpE9FVyRIFdeESkoHhEnid8RI1nfUSy4YBE/C2CRGKQg0RcR4RE54uEROUAhUTBWodErKyMRNsVkkRvUpJEukWURO58lkSqRZ1E2VauRPBTwEQf9cVE9BnKRA=='
decode = base64.b64decode(s)
tmp_size = len(decode) // 4
Now I am trying to convert these bytes to little-endian from here.
I want to do the remaining operations in Python.
I am trying to figure it out myself, but it is taking too much time.
Thanks!
It appears your data isn't actually compressed. Read the data as floats either in a loop using struct.unpack_from() or as one big structure using struct.unpack().
import base64
import struct
encoded = 'Dn3LQ3np ... 9BnKRA=='
# decode the string
data = base64.standard_b64decode(encoded)
# ensure that there's enough data for 32-bit floats
assert len(data) % 4 == 0
# determine how many floats there are
count = len(data) // 4
# unpack the data as floats
result = struct.unpack('<{0}f'.format(count), # one big structure of `count` floats
data) # results returned as a tuple
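The loop variant mentioned above is not shown, so here is a minimal sketch of it. It assumes the payload is a plain sequence of little-endian 32-bit floats, and to stay short it uses only the first 16 characters of the sample string (12 bytes, so three floats).

```python
import base64
import struct

# First 16 characters of the sample string from the question (12 bytes).
encoded = 'Dn3LQ3np4kOyxQJE'
data = base64.b64decode(encoded)

floats = []
for offset in range(0, len(data), 4):
    # unpack_from reads at the given offset and returns a 1-tuple for '<f'
    (value,) = struct.unpack_from('<f', data, offset)
    floats.append(value)

print(floats)  # three values; the first is approximately 406.98
```

The loop is handy when you want to stop early or process values one at a time; for a whole buffer, the single struct.unpack() call above does the same job in one step.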
If the data is compressed, decompress it before unpacking.
import zlib
decompressed = zlib.decompress(data)
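To see the whole chain in one place, here is a rough sketch that round-trips a few made-up values through pack, compress and Base64, then reverses the steps. The helper name decode_floats and its compressed flag are hypothetical, and the float values are invented for illustration; they are not taken from the question's data.

```python
import base64
import struct
import zlib

def decode_floats(encoded, compressed=False):
    """Hypothetical helper: Base64 text -> tuple of little-endian 32-bit floats."""
    data = base64.b64decode(encoded)
    if compressed:
        data = zlib.decompress(data)
    count, remainder = divmod(len(data), 4)
    if remainder:
        raise ValueError('payload length is not a multiple of 4 bytes')
    return struct.unpack('<%df' % count, data)

# Made-up values, chosen to be exactly representable as 32-bit floats.
original = (1.5, -2.25, 1024.0)
payload = struct.pack('<3f', *original)

encoded_plain = base64.b64encode(payload)
encoded_zlib = base64.b64encode(zlib.compress(payload))

print(decode_floats(encoded_plain))
print(decode_floats(encoded_zlib, compressed=True))
```

Note the order: Base64-decode first, decompress second, unpack last. Calling zlib.decompress() on data that was never compressed normally raises zlib.error, which is one way to notice that the decompression step does not apply.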
Convert bytes to little-endian
Byte ordering only applies to data types that are greater than 1 byte. So you can't just convert a list of bytes to little-endian. You need to understand what is in your list of bytes.
A 32-bit integer is 4 bytes, so if you have 16 bytes of data, you could "unpack" them into four 32-bit integers.
If the data is just ASCII text, endianness doesn't matter; that's why you can read the exact same ASCII text file on both big-endian and little-endian machines.
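A small sketch of these points: the same bytes give different numbers depending on the byte order you declare, while single-byte text is unaffected. The four bytes in raw are the first four bytes of the question's decoded sample; the 16-byte buffer is made up for illustration.

```python
import struct

# The first four bytes of the question's decoded data.
raw = b'\x0e\x7d\xcb\x43'

little = struct.unpack('<f', raw)[0]   # approximately 406.98
big = struct.unpack('>f', raw)[0]      # a very different, tiny number
print(little, big)

# 16 made-up bytes (0x00 .. 0x0f) viewed as four 32-bit unsigned integers.
sixteen = bytes(bytearray(range(16)))
as_le = struct.unpack('<4I', sixteen)
as_be = struct.unpack('>4I', sixteen)
print([hex(x) for x in as_le])
print([hex(x) for x in as_be])

# Single-byte data such as ASCII text has no byte order to swap.
assert struct.pack('<3s', b'abc') == struct.pack('>3s', b'abc') == b'abc'
```

So "converting to little-endian" is not a separate step on the byte list; it is a choice you make in the format string when you interpret the bytes.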
Here is an example demonstrating struct.pack and struct.unpack:
#!/usr/bin/env python2.7
import struct
# 32-bit unsigned integer
# base 10 2,864,434,397
# base 16 0xAABBCCDD
u32 = 0xAABBCCDD
print 'u32 =', u32, '(0x%x)' % u32
# big endian 0xAA 0xBB 0xCC 0xDD
u32be = struct.pack('>I', u32)
bx = [byte for byte in struct.unpack('4B', u32be)]
print 'big endian packed', ['0x%02x' % x for x in bx]
assert bx == [0xaa, 0xbb, 0xcc, 0xdd]
# little endian 0xDD 0xCC 0xBB 0xAA
u32le = struct.pack('<I', u32)
lx = [byte for byte in struct.unpack('4B', u32le)]
print 'little endian packed', ['0x%02x' % x for x in lx]
assert lx == [0xdd, 0xcc, 0xbb, 0xaa]
# 64-bit unsigned integer
# base 10 12,302,652,060,662,169,617
# base 16 0xAABBCCDDEEFF0011
u64 = 0xAABBCCDDEEFF0011L
print 'u64 =', u64, '(0x%x)' % u64
# big endian 0xAA 0xBB 0xCC 0xDD 0xEE 0xFF 0x00 0x11
u64be = struct.pack('>Q', u64)
bx = [byte for byte in struct.unpack('8B', u64be)]
print 'big endian packed', ['0x%02x' % x for x in bx]
assert bx == [0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff, 0x00, 0x11]
# little endian 0x11 0x00 0xFF 0xEE 0xDD 0xCC 0xBB 0xAA
u64le = struct.pack('<Q', u64)
lx = [byte for byte in struct.unpack('8B', u64le)]
print 'little endian packed', ['0x%02x' % x for x in lx]
assert lx == [0x11, 0x00, 0xff, 0xee, 0xdd, 0xcc, 0xbb, 0xaa]
Check out the documentation for more info: http://docs.python.org/library/struct.html#format-strings
Looks like your next step will be to use struct. Something like this:
struct.unpack("<f", decode[0:4])
This example will turn the first four bytes of decode into a float. Check out the struct documentation for more info on format strings, etc.
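If you want to pick out individual values this way, a precompiled struct.Struct object keeps the format and its size together. The sketch below is one possible way to do it; the helper float_at is a hypothetical name, and decode here holds only a short prefix of the sample string rather than the full data.

```python
import base64
import struct

# Short prefix of the sample string (16 Base64 characters -> 12 bytes).
decode = base64.b64decode('Dn3LQ3np4kOyxQJE')

# Precompiled format: one little-endian 32-bit float.
float32 = struct.Struct('<f')

def float_at(buf, index):
    """Hypothetical helper: return the index-th 32-bit float stored in buf."""
    start = index * float32.size
    return float32.unpack(buf[start:start + float32.size])[0]

first = float_at(decode, 0)    # same as struct.unpack("<f", decode[0:4])[0]
second = float_at(decode, 1)
print(first, second)
```

Using float32.size instead of a hard-coded 4 means the slicing stays correct if you later change the format, for example to '<d' for 64-bit doubles.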