Convert byte to specific mask with bit hack
I have a number with the binary representation 0000abcd. How can I convert it to 0a0b0c0d with the smallest number of operations?
And how can I convert 0a0b0c0d back to 0000abcd?
I searched for a solution at http://graphics.stanford.edu/~seander/bithacks.html and elsewhere.
The actual problem is slightly more general than described above.
Given a first number a₁b₁c₁d₁a₂b₂c₂d₂ and a second number a₃a₄b₃b₄c₃c₄d₃d₄: if a₁ and a₂ are both 0, clear both a₃ and a₄; if a₃ and a₄ are both 0, clear both a₁ and a₂; and so on for b, c and d.
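To pin down the rule before optimizing it, here is a minimal bit-by-bit sketch in C. It assumes that "a₁ and a₂ = 0" means both bits are zero (which matches the OR-based solution below) and that a₁ is the most significant bit of the first byte. The name mutual_clear_ref is made up for illustration.

```c
#include <stdint.h>

/* Reference version of the rule, one letter group at a time.
 * x holds a1 b1 c1 d1 a2 b2 c2 d2 (a1 is bit 7, a2 is bit 3).
 * y holds a3 a4 b3 b4 c3 c4 d3 d4 (a3 is bit 7, a4 is bit 6). */
static void mutual_clear_ref(uint8_t *x, uint8_t *y)
{
    uint8_t x_in = *x, y_in = *y;   /* test the original values only */

    for (int i = 0; i < 4; i++) {   /* i = 0 is group d, i = 3 is group a */
        uint8_t x_bits = (uint8_t)(0x11u << i);        /* group's two bits in x */
        uint8_t y_bits = (uint8_t)(0x03u << (2 * i));  /* group's two bits in y */

        if ((x_in & x_bits) == 0)
            *y &= (uint8_t)~y_bits;  /* group empty in x: clear it in y */
        if ((y_in & y_bits) == 0)
            *x &= (uint8_t)~x_bits;  /* group empty in y: clear it in x */
    }
}
```

A loop like this is slow, but it is handy as something to compare a branch-free version against.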
My solution:
     a₁b₁c₁d₁a₂b₂c₂d₂
OR   0 0 0 0 a₁b₁c₁d₁   (a₁b₁c₁d₁a₂b₂c₂d₂ >> 4)
     ----------------
     0 0 0 0 a b c d
     ? (magic transformation)
     ? ? ? ? ? ? ? ?
     ----------------
     0 a 0 b 0 c 0 d
OR   a 0 b 0 c 0 d 0    (0 a 0 b 0 c 0 d << 1)
     ----------------
     a a b b c c d d
AND  a₃a₄b₃b₄c₃c₄d₃d₄
     ----------------
     A₃A₄B₃B₄C₃C₄D₃D₄   (clear bits)
UPDATED (thanks to @AShelly):
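x = a₁b₁c₁d₁a₂b₂c₂d₂
x = (x | x >> 4) & 0x0F   // 0000abcd, with a = a₁|a₂, b = b₁|b₂, ...
x = (x | x << 2) & 0x33   // 00ab00cd
x = (x | x << 1) & 0x55   // 0a0b0c0d
x = (x | x << 1)          // aabbccdd: mask to AND with the second number
y = a₃a₄b₃b₄c₃c₄d₃d₄
y = (y | y >> 1) & 0x55   // 0a0b0c0d, with a = a₃|a₄, b = b₃|b₄, ...
y = (y | y >> 1) & 0x33   // 00ab00cd
y = (y | y >> 2) & 0x0F   // 0000abcd
y = (y | y << 4)          // abcdabcd: mask to AND with the first number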
The same steps work on 32-bit words with the constants 0x0F0F0F0F, 0x33333333 and 0x55555555 (and with constants twice as long for 64-bit).
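For anyone who wants to try it, here is a rough C sketch of the 8-bit version of the update above, including the final AND step that the listing leaves implicit. The function names are invented for illustration. The intermediate values are held in unsigned so that the left shifts cannot lose bits before masking.

```c
#include <stdint.h>

/* x = a1 b1 c1 d1 a2 b2 c2 d2  ->  aabbccdd, where a = a1|a2, b = b1|b2, ... */
static uint8_t pair_mask_from_x(uint8_t x)
{
    unsigned m = x;
    m = (m | m >> 4) & 0x0Fu;   /* 0000abcd */
    m = (m | m << 2) & 0x33u;   /* 00ab00cd */
    m = (m | m << 1) & 0x55u;   /* 0a0b0c0d */
    return (uint8_t)(m | m << 1);
}

/* y = a3 a4 b3 b4 c3 c4 d3 d4  ->  abcdabcd, where a = a3|a4, b = b3|b4, ... */
static uint8_t nibble_mask_from_y(uint8_t y)
{
    unsigned m = y;
    m = (m | m >> 1) & 0x55u;   /* 0a0b0c0d */
    m = (m | m >> 1) & 0x33u;   /* 00ab00cd */
    m = (m | m >> 2) & 0x0Fu;   /* 0000abcd */
    return (uint8_t)(m | m << 4);
}

/* Clear every letter group that is all-zero in the other number. */
static void mutual_clear(uint8_t *x, uint8_t *y)
{
    uint8_t keep_y = pair_mask_from_x(*x);    /* both masks come from the */
    uint8_t keep_x = nibble_mask_from_y(*y);  /* original inputs          */
    *x &= keep_x;
    *y &= keep_y;
}
```

For example, with x = 0x80 (only a₁ set) and y = 0xC3 (groups a and d set), x stays 0x80 and y becomes 0xC0.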
If you're looking for the smallest number of operations, use a look-up table.
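As a sketch of what such a table could look like for the simple 0000abcd to 0a0b0c0d case (the names SPREAD4 and spread4_lut are just for illustration), 16 entries are enough:

```c
#include <stdint.h>

/* SPREAD4[n] is the nibble 0000abcd widened to 0a0b0c0d. */
static const uint8_t SPREAD4[16] = {
    0x00, 0x01, 0x04, 0x05, 0x10, 0x11, 0x14, 0x15,
    0x40, 0x41, 0x44, 0x45, 0x50, 0x51, 0x54, 0x55
};

static uint8_t spread4_lut(uint8_t n)
{
    return SPREAD4[n & 0x0Fu];  /* mask so the index never leaves the table */
}
```

The reverse direction would need a table indexed by the spread value, so it is either sparse (up to 0x55 entries) or needs the index compacted first.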
"I have a number with the binary representation 0000abcd. How can I convert it to 0a0b0c0d with the smallest number of operations?"
Isn't this exactly "Interleave bits of X and Y" with Y = 0? Bit Twiddling Hacks has multiple solutions that don't use a lookup table.
"How can I convert 0a0b0c0d back to 0000abcd?"
See "How to de-interleave bits (UnMortonizing?)"
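Scaled down to a single nibble, the shift-and-mask approach from those two references might look like the sketch below. The names spread4 and compact4 are made up here, and the constants are just the 8-bit versions of the usual 0x33.../0x55... masks.

```c
#include <stdint.h>

/* 0000abcd -> 0a0b0c0d (interleave with zero) */
static uint8_t spread4(uint8_t n)
{
    unsigned v = n & 0x0Fu;
    v = (v | v << 2) & 0x33u;   /* 00ab00cd */
    v = (v | v << 1) & 0x55u;   /* 0a0b0c0d */
    return (uint8_t)v;
}

/* 0a0b0c0d -> 0000abcd (de-interleave) */
static uint8_t compact4(uint8_t n)
{
    unsigned v = n & 0x55u;
    v = (v | v >> 1) & 0x33u;   /* 00ab00cd */
    v = (v | v >> 2) & 0x0Fu;   /* 0000abcd */
    return (uint8_t)v;
}
```

Each direction is two shift/OR/AND rounds, and compact4(spread4(n)) gives back n for any 4-bit n.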
You can't do it in one go; you have to move the bits one at a time.
Pseudo code:
X1 = a₁b₁c₁d₁
X2 = a₂b₂c₂d₂
Bm = 1 0 0 0 // Bit mask, starting at the most significant bit
Result = 0;
while (Bm != 0) // some bits left
{
    Result = Result shl 2;               // make room for the next pair
    if (X1 and Bm) Result = Result or 2; // bit taken from X1
    if (X2 and Bm) Result = Result or 1; // bit taken from X2
    Bm = Bm shr 1
}
As a result you will get a₁a₂b₁b₂c₁c₂d₁d₂
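In C, a bit-at-a-time loop of this kind could be sketched as follows. This is one possible reading of the pseudo code, not necessarily what the answer had in mind, and the function name is invented.

```c
#include <stdint.h>

/* Interleave two 4-bit values: a1b1c1d1 and a2b2c2d2 -> a1a2b1b2c1c2d1d2. */
static uint8_t interleave4_loop(uint8_t x1, uint8_t x2)
{
    uint8_t result = 0;

    /* Walk the mask from the most significant of the four bits down. */
    for (uint8_t bm = 0x08; bm != 0; bm >>= 1) {
        result = (uint8_t)(result << 2);   /* make room for the next pair */
        if (x1 & bm) result |= 0x02;       /* bit from x1 goes on the left */
        if (x2 & bm) result |= 0x01;       /* bit from x2 goes on the right */
    }
    return result;
}
```

Calling it with x2 = 0 gives a0b0c0d0, so the 0a0b0c0d layout from the question comes from passing the nibble as x2 and 0 as x1.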
I think it is not possible (without a lookup table) to do it in fewer operations using binary arithmetic on the x86 or x64 processor architecture. Correct me if I'm mistaken, but your problem is about moving bits: given the bits abcd, you want to get the bits 0a0b0c0d in one operation. The problem shows up when you look at how far each of the bits 'a', 'b', 'c' and 'd' has to travel.
'a' was 4-th, became 7-th, distance travelled 3 bits
'b' was 3-rd, became 5-th, distance travelled 2 bits
'c' was 2-nd, became 3-rd, distance travelled 1 bit
'd' was 1-st, became 1-st, distance travelled 0 bits
No ordinary shift or rotate instruction moves each bit by a different distance. (Newer x86 CPUs with the BMI2 extension do provide PDEP and PEXT, which deposit and extract bits under a mask.) However, if you get different input representations of the same number for free, for example several precomputed values that you reuse in a loop, then some optimization may be possible; this is the gain you get from using additional knowledge about the topology. You just have to choose between:
[4 cycles, n^0 memory]
[2 cycles, n^1 memory]
[1 cycle , n^2 memory]