use of the bitwise operators to pack multiple values in one int
Low-level bit manipulation has never been my strong point. I would appreciate some help understanding the following use of bitwise operators. Consider:
int age, gender, height, packed_info;
. . . // Assign values
// Pack as AAAAAAA G HHHHHHH using shifts and "or"
packed_info = (age << 8) | (gender << 7) | height;
// Unpack with shifts and masking using "and"
height = packed_info & 0x7F; // This constant is binary ...01111111
gender = (packed_info >> 7) & 1;
age = (packed_info >> 8);
I am not sure what this code accomplishes, or how. Why use the magic number 0x7F? How are the packing and unpacking accomplished?
Source
As the comment says, we're going to pack the age, gender and height into 15 bits, of the format:
AAAAAAAGHHHHHHH
Let's start with this part:
(age << 8)
To start with, age has this format:
age = 00000000AAAAAAA
where each A can be 0 or 1.
<< 8 moves the bits 8 places to the left, and fills in the gaps with zeroes. So you get:
(age << 8) = AAAAAAA00000000
Similarly:
gender = 00000000000000G
(gender << 7) = 0000000G0000000
height = 00000000HHHHHHH
Now we want to combine these into one variable. The | operator works by looking at each bit, and returning 1 if the bit is 1 in either of the inputs. So:
0011 | 0101 = 0111
If a bit is 0 in one input, then you get the bit from the other input. Looking at (age << 8), (gender << 7) and height, you'll see that, if a bit is 1 for one of these, it's 0 for the others. So:
packed_info = (age << 8) | (gender << 7) | height = AAAAAAAGHHHHHHH
Now we want to unpack the bits. Let's start with the height. We want to get the last 7 bits, and ignore the first 8. To do this, we use the & operator, which returns 1 only if both of the input bits are 1. So:
0011 & 0101 = 0001
So:
packed_info = AAAAAAAGHHHHHHH
0x7F = 000000001111111
(packed_info & 0x7F) = 00000000HHHHHHH = height
To get the age, we can just push everything 8 places to the right, and we're left with 00000000AAAAAAA. So age = (packed_info >> 8).
Finally, to get the gender, we push everything 7 places to the right to get rid of the height. We then only care about the last bit:
packed_info = AAAAAAAGHHHHHHH
(packed_info >> 7) = 0000000AAAAAAAG
1 = 000000000000001
(packed_info >> 7) & 1 = 00000000000000G
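For anyone who wants to try this, here is a minimal C sketch of the round trip described above. The function names (pack_info, unpack_age and so on) are made up for illustration and are not from the question, and the sketch assumes age and height each fit in 7 bits and gender in 1 bit.

```c
#include <stdint.h>

/* Pack as AAAAAAA G HHHHHHH (15 bits).
   Assumption: each argument already fits its field. */
static uint16_t pack_info(unsigned age, unsigned gender, unsigned height)
{
    return (uint16_t)((age << 8) | (gender << 7) | height);
}

/* Keep only the lowest 7 bits. */
static unsigned unpack_height(uint16_t packed) { return packed & 0x7F; }

/* Drop the 7 height bits, then keep the lowest bit. */
static unsigned unpack_gender(uint16_t packed) { return (packed >> 7) & 1; }

/* Drop the height and gender bits; only the age is left. */
static unsigned unpack_age(uint16_t packed) { return packed >> 8; }
```

With age 30, gender 1 and height 70, pack_info returns 7878, and the three unpack functions give back 30, 1 and 70.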
This could be a rather long lesson in bit manipulation, but first let me point you to the bit masking article on Wikipedia.
packed_info = (age << 8) | (gender << 7) | height;
Take age and move its value over 8 bits, then take gender and move it over 7 bits; height will occupy the last bits.
age = 0b101
gender = 0b1
height = 0b1100
packed_info = 0b10100000000
| 0b00010000000
| 0b00000001100
/* which is */
packed_info = 0b10110001100
Unpacking does the reverse but uses masks like 0x7F (which is 0b01111111) to trim out the other values in the field.
gender = (packed_info >> 7) & 1;
Would work like...
gender = 0b1011 /* shifted 7 here but still has age on the other side */
& 0b0001
/* which is */
gender = 0b1
Note that ANDing a bit with 1 is the same as "keeping" that bit, and ANDing it with 0 is the same as "ignoring" (clearing) that bit.
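If you want to see the bit patterns yourself, a small helper like the one below can print them. This is only a sketch: to_binary is a hypothetical name, and the code uses plain decimal constants because 0b literals are a compiler extension in older C standards.

```c
/* Write the lowest `bits` bits of `value` into `out` as '0'/'1' characters,
   most significant bit first. `out` must have room for bits + 1 chars. */
static void to_binary(unsigned value, int bits, char *out)
{
    for (int i = 0; i < bits; i++) {
        /* Move the bit we want to position 0, then mask everything else. */
        out[i] = ((value >> (bits - 1 - i)) & 1u) ? '1' : '0';
    }
    out[bits] = '\0';
}
```

For the values above (age 5, gender 1, height 12), to_binary((5u << 8) | (1u << 7) | 12u, 11, buf) fills buf with "10110001100".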
If you were going to store a date as a number, maybe you would do it by multiplying the year by 10000, the month by 100, and adding the day. A date such as July 2, 2011 would be encoded as the number 20110702:
year * 10000 + month * 100 + day -> yyyymmdd
2011 * 10000 + 7 * 100 + 2 -> 20110702
We can say that we encoded the date in a yyyymmdd mask. We could describe this operation as
- Shift the year 4 positions to the left,
- shift the month 2 positions to the left and
- leave the day as is.
- Then combine the three values together.
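The decimal version can be sketched in C as well. The function names are hypothetical; decoding with division and remainder is the decimal counterpart of shifting right and masking.

```c
/* Decimal "packing" into the form yyyymmdd. */
static long encode_date(int year, int month, int day)
{
    return year * 10000L + month * 100L + day;
}

/* "Shift right" 4 decimal digits. */
static int date_year(long d)  { return (int)(d / 10000); }

/* "Shift right" 2 decimal digits, then keep the last 2 digits. */
static int date_month(long d) { return (int)(d / 100 % 100); }

/* Keep the last 2 decimal digits. */
static int date_day(long d)   { return (int)(d % 100); }
```

Here encode_date(2011, 7, 2) gives 20110702, and the three decoders return 2011, 7 and 2.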
This is the same thing that is happening with the age, gender and height encoding, only that the author is thinking in binary.
See the ranges that those values may have:
age: 0 to 127 years
gender: M or F
height: 0 to 127 inches
If we translate those values to binary, we would have this:
age: 0 to 1111111b (7 binary digits, or bits)
gender: 0 or 1 (1 bit)
height: 0 to 1111111b (7 bits also)
With this in mind, we can encode the age-gender-height data with the mask aaaaaaaghhhhhhh, only that here we are talking about binary digits, not decimal digits.
So,
- Shift the age 8 bits to the left,
- shift the gender 7 bits to the left and
- leave the height as is.
- Then combine all three values together.
In binary, the Shift-Left operator (<<) moves a value n positions to the left. The "Or" operator ("|" in many languages) combines values together. Therefore:
(age << 8) | (gender << 7) | height
Now, how to "decode" those values?
It's easier in binary than in decimal:
- You "mask away" the height,
- shift the gender 7 bits to the right and mask that away also, and finally
- shift the age 8 bits to the right.
The Shift-Right operator (>>) moves a value n positions to the right (whatever digits shifted "out" of the rightmost position are lost). The "And" binary operator ("&" in many languages) masks bits. To do that it needs a mask, indicating which bits to preserve and which bits to destroy (1 bits are preserved). Therefore:
height = value & 1111111b (preserve the 7 rightmost bits)
gender = (value >> 7) & 1 (preserve just one bit)
age = (value >> 8)
Since 1111111b is written 0x7F in hexadecimal in most languages, that's the reason for that magic number. You would get the same effect by using 127 (which is 1111111b in decimal).
I have faced the same requirement many times. It is very easy with the help of the bitwise AND operator. Just assign each of your values an increasing power of two. To store multiple values, add their assigned numbers (powers of 2) and keep the SUM. This SUM consolidates your selected values. How?
Just do a bitwise AND of the SUM with each value: it gives zero (0) for values that were not selected and non-zero for those that were.
Here is the explanation:
1) Values ( YES, NO, MAYBE )
2) Assignment to power of two(2)
YES = 2^0 = 1 = 00000001
NO = 2^1 = 2 = 00000010
MAYBE = 2^2 = 4 = 00000100
3) I choose YES and MAYBE hence SUM:
SUM = 1 + 4 = 5
SUM = 00000001 + 00000100 = 00000101
This value will store both YES as well as MAYBE. HOW?
1 & 5 = 1 ( non zero )
2 & 5 = 0 ( zero )
4 & 5 = 4 ( non zero )
Hence SUM consists of
1 = 2^0 = YES
4 = 2^2 = MAYBE.
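A minimal C sketch of the same idea follows. YES, NO and MAYBE come from the example above; has_flag is a hypothetical helper name.

```c
/* Each option gets its own bit (an increasing power of two). */
enum { YES = 1 << 0, NO = 1 << 1, MAYBE = 1 << 2 };

/* Non-zero if `flag` was selected in `selected`. */
static int has_flag(unsigned selected, unsigned flag)
{
    return (selected & flag) != 0;
}
```

With unsigned sum = YES + MAYBE (which is 5), has_flag(sum, YES) and has_flag(sum, MAYBE) are true and has_flag(sum, NO) is false. Combining with | instead of + gives the same result here and stays correct even if the same option is added twice.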
A more condensed answer:
AAAAAAA G HHHHHHH
Packing:
packed = age << 8 | gender << 7 | height
Alternatively, you can just sum the components, e.g. when using the MySQL SUM aggregate function. The parentheses are required here because + binds more tightly than <<:
packed = (age << 8) + (gender << 7) + height
Unpacking:
age = packed >> 8 // no mask required
gender = packed >> 7 & ((1 << 1) - 1) // applying mask (for gender it is just 1)
height = packed & ((1 << 7) - 1) // applying mask
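The three unpacking lines follow one pattern (shift right by the field's offset, then mask to the field's width), so they can be folded into a single helper. This is just a sketch; extract_field is a made-up name, and it assumes width is smaller than the number of bits in an unsigned int.

```c
/* Return the `width`-bit field that starts `offset` bits from the right.
   Assumption: width is less than the bit width of unsigned. */
static unsigned extract_field(unsigned packed, unsigned offset, unsigned width)
{
    unsigned mask = (1u << width) - 1u;   /* e.g. width 7 gives 0x7F */
    return (packed >> offset) & mask;
}
```

For packed = (30 << 8) | (1 << 7) | 70, extract_field(packed, 8, 7) is the age (30), extract_field(packed, 7, 1) the gender (1) and extract_field(packed, 0, 7) the height (70).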
Another (longer) example:
Say you have an IP address you want to pack, but it is a fictional IP address, e.g. 132.513.151.319. Note that some components are greater than 255, so they need more than 8 bits, unlike the components of a real IP address.
First we need to figure out how many bits each component needs in order to store the maximum value. Let's say no component of our fictional IPs can be bigger than 999; that means we need 10 bits of storage per component (which allows numbers up to 1023).
packed = (comp1 << 0 * 10) | (comp2 << 1 * 10) | (comp3 << 2 * 10) | (comp4 << 3 * 10)
Which gives dec 342682502276
or bin 100111111001001011110000000010010000100
Now let's unpack the value:
comp1 = (packed >> 0 * 10) & ((1 << 10) - 1) // 132
comp2 = (packed >> 1 * 10) & ((1 << 10) - 1) // 513
comp3 = (packed >> 2 * 10) & ((1 << 10) - 1) // 151
comp4 = (packed >> 3 * 10) & ((1 << 10) - 1) // 319
Where (1 << 10) - 1 is a binary mask we use to hide the bits to the left of the 10 rightmost bits we are interested in.
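In C, note that the packed value needs 40 bits, so it does not fit in a 32-bit int. The sketch below uses uint64_t for that reason; pack4, unpack_component and BITS_PER_COMPONENT are hypothetical names chosen for this example.

```c
#include <stdint.h>

#define BITS_PER_COMPONENT 10

/* Pack four components of up to 10 bits each; c1 ends up in the lowest bits.
   Assumption: every component is at most 1023. */
static uint64_t pack4(uint64_t c1, uint64_t c2, uint64_t c3, uint64_t c4)
{
    return (c1 << (0 * BITS_PER_COMPONENT)) |
           (c2 << (1 * BITS_PER_COMPONENT)) |
           (c3 << (2 * BITS_PER_COMPONENT)) |
           (c4 << (3 * BITS_PER_COMPONENT));
}

/* Extract component number `index` (0 is the lowest 10 bits). */
static unsigned unpack_component(uint64_t packed, unsigned index)
{
    uint64_t mask = (UINT64_C(1) << BITS_PER_COMPONENT) - 1;
    return (unsigned)((packed >> (index * BITS_PER_COMPONENT)) & mask);
}
```

pack4(132, 513, 151, 319) gives 342682502276, and unpack_component with indexes 0 to 3 returns 132, 513, 151 and 319.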
Same example using MySQL query
SELECT
(@offset := 10) AS `No of bits required for each component`,
(@packed := (132 << 0 * @offset) |
(513 << 1 * @offset) |
(151 << 2 * @offset) |
(319 << 3 * @offset)) AS `Packed value (132.513.151.319)`,
BIN(@packed) AS `Packed value (bin)`,
(@packed >> 0 * @offset) & ((1 << @offset) - 1) `Component 1`,
(@packed >> 1 * @offset) & ((1 << @offset) - 1) `Component 2`,
(@packed >> 2 * @offset) & ((1 << @offset) - 1) `Component 3`,
(@packed >> 3 * @offset) & ((1 << @offset) - 1) `Component 4`;
The left shift operator means "multiply by two, this many times". In binary, multiplying a number by two is the same as adding a zero to the right side.
The right shift operator is the reverse of the left shift operator.
The pipe operator is "or", meaning overlay two binary numbers on top of each other, and where there is a 1 in either number the result in that column is a 1.
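These three descriptions can be checked with a few one-line C functions. The names are invented for this sketch, and it sticks to unsigned values because right-shifting a negative signed value is implementation-defined in C.

```c
/* Multiply x by two, n times (assumes n is small enough that no bits
   are pushed off the left end). */
static unsigned shift_left(unsigned x, unsigned n)  { return x << n; }

/* Divide x by two, n times; bits shifted out on the right are discarded. */
static unsigned shift_right(unsigned x, unsigned n) { return x >> n; }

/* Overlay two values: a result bit is 1 if it is 1 in either input. */
static unsigned overlay(unsigned a, unsigned b)     { return a | b; }
```

For example, shift_left(5, 3) is 40 and shift_right(40, 3) is 5 again, but shift_right(41, 3) is also 5, so the right shift only undoes the left shift when no set bits fall off the end.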
So, let's extract the operation for packed_info:
// Create age, shifted left 8 times:
// AAAAAAA00000000
age_shifted = age << 8;
// Create gender, shifted left 7 times:
// 0000000G0000000
gender_shifted = gender << 7;
// "Or" them all together:
// AAAAAAA00000000
// 0000000G0000000
// 00000000HHHHHHH
// ---------------
// AAAAAAAGHHHHHHH
packed_info = age_shifted | gender_shifted | height;
And the unpacking is the reverse.
// Grab the lowest 7 bits:
// AAAAAAAGHHHHHHH &
// 000000001111111 =
// 00000000HHHHHHH
height = packed_info & 0x7F;
// right shift the 'height' bits into the bit bucket, and grab the lowest 1 bit:
// AAAAAAAGHHHHHHH
// >> 7
// 0000000AAAAAAAG &
// 000000000000001 =
// 00000000000000G
gender = (packed_info >> 7) & 1;
// right shift the 'height' and 'gender' bits into the bit bucket, and grab the result:
// AAAAAAAGHHHHHHH
// >> 8
// 00000000AAAAAAA
age = (packed_info >> 8);
You can see the expression x & mask as an operation that removes from x the bits that are not present (i.e., have value 0) in mask. That means packed_info & 0x7F removes from packed_info all the bits that are above the seventh bit.
Example: if packed_info is 1110010100101010 in binary, then packed_info & 0x7f will be
1110010100101010
0000000001111111
----------------
0000000000101010
So, in height we get the lower 7 bits of packed_info.
Next, we shift the whole packed_info right by 7, which removes the information we have already read out. So we get (for the value from the previous example) 111001010.
The gender is stored in the next bit, so with the same trick, & 1, we extract only that bit from the information. The rest of the information is contained at offset 8.
Packing back is not complicated either: you take age, shift it 8 bits (so you get 1110010100000000 from 11100101), shift the gender by 7 (so you get 00000000), and take the height (assuming it fits in the lower 7 bits). Then you combine all of them together:
1110010100000000
0000000000000000
0000000000101010
----------------
1110010100101010
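The "assuming it fits" part matters: a height above 127 would spill into the gender bit. One way to guard against that, sketched here with a hypothetical function name, is to mask gender and height before combining them. The age is left unmasked because it occupies all the bits above bit 7.

```c
/* Pack age, gender and height, masking the two low fields so that an
   out-of-range value cannot overwrite its neighbour. */
static unsigned pack_masked(unsigned age, unsigned gender, unsigned height)
{
    return (age << 8)              /* everything above bit 7 */
         | ((gender & 1u) << 7)    /* exactly one bit        */
         | (height & 0x7Fu);       /* lowest 7 bits only     */
}
```

With the values from the example (age 11100101, gender 0, height 0101010), pack_masked returns 1110010100101010, i.e. 0xE52A. Note that masking silently truncates a bad value (a height of 200 is stored as 72); whether truncating or rejecting the input is better depends on the application.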