Fastest way to convert unsigned char 8 bits to actual numbers

2023-01-25 06:53

I am using an unsigned char to store 8 flags. Each flag represents a corner of a cube, so 00000001 is corner 1, 01000100 is corners 3 and 7, and so on. My current solution is to & the result with 1, 2, 4, 8, 16, 32, 64 and 128, check whether the result is non-zero, and store the corner. That is, if (result & 1) corners.push_back(1);. Is there any chance I can get rid of that 'if' statement? I was hoping to do it with bitwise operators, but I could not think of any.

A little background on why I want to get rid of the if statement. This cube is actually a Voxel which is part of a grid that is at least 512x512x512 in size. That is more than 134 million Voxels. I am performing calculations on each one of the Voxels (well, not exactly, but I won't go into too much detail as it is irrelevant here), and that is a lot of calculations. And I need to perform these calculations per frame. Any speed boost, however minuscule per function call, will help with this many calculations. To give you an idea, my algorithm (at some point) needed to determine whether a float was negative, positive or zero (within some error). I had if statements in there and greater-than/less-than checks. I replaced them with a fast float-to-int function and shaved off a quarter of a second. Currently, each frame in a 128x128x128 grid takes a little more than 4 seconds.

I would consider a different approach to it entirely: there are only 256 possibilities for different combinations of flags. Precalculate 256 vectors and index into them as needed.

std::vector<std::vector<int> > corners(256);
for (int i = 0; i < 256; ++i) {
    std::vector<int>& v = corners[i];
    if (i & 1) v.push_back(1);
    if (i & 2) v.push_back(2);
    if (i & 4) v.push_back(4);
    if (i & 8) v.push_back(8);
    if (i & 16) v.push_back(16);
    if (i & 32) v.push_back(32);
    if (i & 64) v.push_back(64);
    if (i & 128) v.push_back(128);
}

for (int i = 0; i < NumVoxels(); ++i) {
    unsigned char flags = GetFlags(i);
    const std::vector<int>& v = corners[flags];

    ... // do whatever with v
}

This avoids all the per-voxel conditionals, and it also avoids having push_back call new, which I suspect would be more expensive anyway.
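The same table idea can be sketched without nested vectors, so that a lookup touches no heap memory beyond the table itself. This is only a sketch under assumptions: the CornerList struct and BuildCornerTable are hypothetical names, and it stores corner numbers 1..8 (as in the question) rather than the bit values used above.

```cpp
#include <vector>

// Hypothetical fixed-size record: how many corners are set, and which ones.
struct CornerList {
    unsigned char count;      // number of set flags (0..8)
    unsigned char corner[8];  // corner numbers 1..8, lowest bit first
};

// Build all 256 entries once, e.g. at startup.
std::vector<CornerList> BuildCornerTable() {
    std::vector<CornerList> table(256);  // entries start zero-initialized
    for (int flags = 0; flags < 256; ++flags) {
        CornerList& entry = table[flags];
        for (int bit = 0; bit < 8; ++bit) {
            if (flags & (1 << bit)) {
                entry.corner[entry.count++] = static_cast<unsigned char>(bit + 1);
            }
        }
    }
    return table;
}
```

The conditionals run only while the table is built; per voxel, the work is a single index into table[flags] followed by a loop over entry.count items.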

If there's some operation that needs to be done if the bit is set and not if it's not, it seems you'll have to have a conditional of some kind somewhere. If it could be expressed as a calculation somehow, you could get around it like this, for example:

numCorners = ((result >> 0) & 1) + ((result >> 1) & 1) + ((result >> 2) & 1) + ...
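Spelled out for all eight bits, the expression might look like the sketch below. CountCorners is a hypothetical name; the only assumption is that result is the 8-bit flag byte from the question.

```cpp
// Sum the eight flag bits of `result`; no conditional is involved.
inline int CountCorners(unsigned char result) {
    return ((result >> 0) & 1) + ((result >> 1) & 1) +
           ((result >> 2) & 1) + ((result >> 3) & 1) +
           ((result >> 4) & 1) + ((result >> 5) & 1) +
           ((result >> 6) & 1) + ((result >> 7) & 1);
}
```

For the example byte 01000100 from the question, this sums to 2.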

Hacker's Delight, first page:

x & (-x) // isolates the lowest set bit
x & (x - 1) // clears the lowest set bit
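A minimal sketch of how those two identities could be combined to visit only the bits that are set. IsolateSetBits is a hypothetical helper, and the work is done in unsigned int so that the negation is well-defined modular arithmetic.

```cpp
// Writes one single-bit mask per set flag into `masks` (lowest bit first)
// and returns how many were written. `masks` must have room for 8 entries.
inline int IsolateSetBits(unsigned char flags, unsigned char masks[8]) {
    unsigned int x = flags;
    int n = 0;
    while (x != 0) {
        masks[n++] = static_cast<unsigned char>(x & (0u - x));  // isolate the lowest set bit
        x &= x - 1;                                              // clear the lowest set bit
    }
    return n;
}
```

The loop still has a condition, but it runs once per set bit rather than once per possible bit, so a byte with two flags set costs two iterations.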

Inlining your push_back method would also help (better yet, create a function that receives all the flags together).

Usually if you need performance, you should design the whole system with that in mind. Maybe if you post more code it will be easier to help.

EDIT: here is a nice idea:

unsigned char LOG2_LUT[256] = {...};
int t;
switch (count_set_bits(flags)) {
    case 8:     t = flags;
                flags &= (flags - 1);       // clear the lowest set bit
                t ^= flags;                 // t now holds only the bit that was cleared
                corners.push_back(LOG2_LUT[t]);
                // fall through
    case 7:     t = flags;
                flags &= (flags - 1);
                t ^= flags;
                corners.push_back(LOG2_LUT[t]);
                // fall through
    case 6:     t = flags;
                flags &= (flags - 1);
                t ^= flags;
                corners.push_back(LOG2_LUT[t]);
                // fall through
    // etc...
}

count_set_bits() is a well-known function: http://www-graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
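The answer does not show its own count_set_bits(), so the following is only a sketch of one way to write it, along the lines of the table lookup described on that page. BitCountTable is a hypothetical helper name.

```cpp
// Hypothetical helper: a 256-entry table of bit counts, filled once.
struct BitCountTable {
    unsigned char count[256];
    BitCountTable() {
        count[0] = 0;
        for (int i = 1; i < 256; ++i) {
            // An entry is its lowest bit plus the count of the remaining bits.
            count[i] = static_cast<unsigned char>((i & 1) + count[i / 2]);
        }
    }
};

// Returns the number of set bits in `flags` with a single table lookup.
inline int count_set_bits(unsigned char flags) {
    static const BitCountTable table;  // built on first call
    return table.count[flags];
}
```

With this in place, the switch above would enter at the case matching the number of set flags and fall through the remaining cases.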

There is a way; it's not "pretty", but it works.

(result & 1)   && (corners.push_back(1), true);
(result & 2)   && (corners.push_back(2), true);
(result & 4)   && (corners.push_back(3), true);
(result & 8)   && (corners.push_back(4), true);
(result & 16)  && (corners.push_back(5), true);
(result & 32)  && (corners.push_back(6), true);
(result & 64)  && (corners.push_back(7), true);
(result & 128) && (corners.push_back(8), true);

It relies on a little-used feature of the C++ language: short-circuit evaluation of &&. Because push_back returns void, the comma operator is needed to give && a boolean right-hand operand.
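Short-circuit evaluation is still a conditional, just written differently. A variant that really has no per-flag condition, which is not from the answer above, is to store every corner unconditionally and advance the write position by the value of the bit. This is a sketch with a hypothetical function name and a fixed-size output buffer in place of the vector; whether it is actually faster would have to be measured.

```cpp
// Writes the corner numbers (1..8) of the set bits in `result` into `out`
// and returns how many are valid. `out` must have room for 8 entries.
inline int AppendCornersBranchFree(unsigned char result, unsigned char out[8]) {
    int n = 0;
    for (int bit = 0; bit < 8; ++bit) {
        out[n] = static_cast<unsigned char>(bit + 1);  // always store the candidate
        n += (result >> bit) & 1;                      // keep it only if the bit is set
    }
    return n;
}
```

A store for a clear bit is simply overwritten by the next candidate. At step bit the write index is at most bit, so the stores never go past out[7].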

I've seen a similar algorithm in the OpenTTD code. It turned out to be utterly useless: you're better off not breaking numbers down like that. Instead, replace the iteration over the vector<> you have now with an iteration over the bits of the byte. This is far more cache-friendly.

I.e.

unsigned char flags = Foo(); // the value you didn't put in a vector<>
for (unsigned char c = (UCHAR_MAX >> 1) + 1; c != 0; c >>= 1) // UCHAR_MAX comes from <climits>; c starts at the top bit
{
  if (flags & c) 
    Bar(flags&c);
}
