Unpack signed little-endian in Ruby
So I'm working on some MongoDB protocol stuff. All integers are signed little-endian. Using Ruby's standard Array#pack method, I can convert from an integer to the binary string I want just fine:
positive_one = Array(1).pack('V') #=> "\x01\x00\x00\x00"
negative_one = Array(-1).pack('V') #=> "\xFF\xFF\xFF\xFF"
However, going the other way, the String#unpack method has the 'V' format documented as specifically returning unsigned integers:
positive_one.unpack('V').first #=> 1
negative_one.unpack('V').first #=> 4294967295
There's no formatter for signed little-endian byte order. I'm sure I could play games with bit-shifting, or write my own byte-mangling method that doesn't use array packing, but I'm wondering if anyone else has run into this and found a simple solution. Thanks very much.
After unpacking with "V", you can apply the following conversion:
class Integer
  def to_signed_32bit
    if self & 0x8000_0000 == 0x8000_0000
      self - 0x1_0000_0000
    else
      self
    end
  end
end
You'll need to change the magic constants 0x1_0000_0000 (which is 2**32) and 0x8000_0000 (2**31) if you're dealing with other sizes of integers.
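The same idea can be written with the bit width as a parameter, so the constants don't have to be edited by hand. This is only a sketch: the helper name to_signed and its bits parameter are made up for illustration and are not part of Ruby's core library.

```ruby
# Hypothetical helper (not a core Ruby method): reinterpret an unsigned
# integer of the given bit width as a two's-complement signed value.
def to_signed(value, bits = 32)
  sign_bit = 1 << (bits - 1)   # 0x8000_0000 when bits is 32
  modulus  = 1 << bits         # 0x1_0000_0000 when bits is 32
  (value & sign_bit).zero? ? value : value - modulus
end

to_signed("\xFF\xFF\xFF\xFF".unpack('V').first)   #=> -1
to_signed("\xFE\xFF".unpack('v').first, 16)       #=> -2
```

Passing the width explicitly avoids monkey-patching Integer once per size, at the cost of the caller having to know how many bits were unpacked.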
Edit: I originally misunderstood the direction you were converting (according to the comment). But after thinking about it some, I believe the solution is still the same. Here is the updated method. It does the exact same thing, but the comments should explain the result:
def convertLEToNative( num )
  # Convert a given 4 byte integer from little-endian to the running
  # machine's native endianness. The pack('V') operation takes the
  # given number and converts it to little-endian (which means that
  # if the machine is little endian, no conversion occurs). On a
  # big-endian machine, the pack('V') will swap the bytes because
  # that's what it has to do to convert from big to little endian.
  # Since the number is already little endian, the swap has the
  # opposite effect (converting from little-endian to big-endian),
  # which is what we want. In both cases, the unpack('l') just
  # produces a signed integer from those bytes, in the machine's
  # native endianness. Note that unpack returns an array, so the
  # result is a one-element array.
  Array(num).pack('V').unpack('l')
end
Probably not the cleanest, but this will convert the byte array.
def convertLEBytesToNative( bytes )
  if ( [1].pack('V').unpack('l').first == 1 )
    # machine is already little endian
    bytes.unpack('l')
  else
    # machine is big endian
    convertLEToNative( Array(bytes.unpack('l')))
  end
end
This question has a method for converting signed to unsigned that might be helpful. It also has a pointer to the bindata gem which looks like it will do what you want.
BinData::Int16le.read("\000\f") # 3072
[edited to remove the not-quite-right s unpack directive]
For the sake of posterity, here's the method I eventually came up with before spotting Paul Rubel's link to the "classical method". It's kludgy and based on string manipulation, so I'll probably scrap it, but it does work, so someone might find it interesting for some other reason someday:
# Returns an integer from the given little-endian binary string.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  bits = str.reverse.unpack('B*').first # Get the 0s and 1s
  if bits[0] == '0' # We're a positive number; life is easy
    bits.to_i(2)
  else # Get the twos complement
    comp, flip = "", false
    bits.reverse.each_char do |bit|
      comp << (flip ? bit.tr('10','01') : bit)
      flip = true if !flip && bit == '1'
    end
    ("-" + comp.reverse).to_i(2)
  end
end
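The bit-flipping loop above amounts to subtracting 2**n from the unsigned reading of an n-bit string whose top bit is set. Here is a rough sketch of that arithmetic shortcut, still going through a bit string. The method name bits_to_signed is hypothetical, and it uses the 'b*' directive (least significant bit first) instead of reversing the bytes.

```ruby
# Hypothetical helper: interpret a little-endian binary string as a
# two's-complement signed integer via its bit string.
def bits_to_signed(str)
  # 'b*' yields each byte's bits LSB-first; with little-endian bytes the
  # whole string is then LSB-first, so reversing it puts the sign bit first.
  bits  = str.unpack('b*').first.reverse
  value = bits.to_i(2)                      # unsigned reading
  bits.start_with?('1') ? value - (1 << bits.length) : value
end

bits_to_signed("\xFF\xFF\xFF\xFF")   #=> -1
bits_to_signed("\x00\x00\x00\x80")   #=> -2147483648
bits_to_signed("\x01\x00\x00\x00")   #=> 1
```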
UPDATE: Here's the simpler refactoring, using a generalized arbitrary-length form of Ken Bloom's answer:
# Returns an integer from the given arbitrary length little-endian binary string.
# The length is expected to be a multiple of 4 bytes, since 'V*' reads
# 32-bit words and ignores any trailing partial word.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  arr, bits, num = str.unpack('V*'), 0, 0
  arr.each do |int|
    num += int << bits
    bits += 32
  end
  num >= 2**(bits-1) ? num - 2**bits : num # Convert from unsigned to signed
end
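On newer Rubies none of this arithmetic may be needed. As far as I know, since Ruby 1.9.3 the integer directives of pack and unpack accept a < (little-endian) or > (big-endian) suffix, so 'l<' reads a signed 32-bit little-endian integer directly and 'q<' does the same for 64 bits. A minimal sketch, assuming such a Ruby version:

```ruby
# Signed 32-bit little-endian, independent of the machine's byte order
[-1].pack('V').unpack('l<').first        #=> -1
[1].pack('V').unpack('l<').first         #=> 1

# Signed 64-bit little-endian round trip
[-2].pack('q<').unpack('q<').first       #=> -2
```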