Comparison with floats in NEON intrinsics
This may be a silly problem, but I spent a day trying to solve it with no luck, so here it is.
I have a register holding four floats (float32x4_t), and I want to process some of the lanes and set the others to 0.
For example, this problem in C:
for (int i = 1; i <= 4; i++)
{
    float b = 4.0f / i;  /* float division; 4/i would truncate to an integer */
    if (b <= 3)
        result += process(b);
}
So the first value is not processed but the others are; I need a register where the first lane holds 0 and the other lanes hold the results.
But I don't know how to do this with NEON intrinsics.
I know that there is vcltq_f32, but I tried it and got no result.
Like this:
const float32x4_t vector_3 = vdupq_n_f32(3.0f);
uint32x4_t mask = vcleq_f32(vector_b, vector_3);
vector_b = vreinterpretq_f32_u32(vandq_u32(vreinterpretq_u32_f32(vector_b), mask));
I don't know much about NEON, but in most SIMD architectures you would do this by comparing and masking (bitwise AND). The compare instruction generates a mask, all ones in the lanes where the condition holds and all zeros elsewhere, which you then AND with the data to clear the unwanted lanes.
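To make the compare-and-mask idea concrete, here is a minimal sketch rather than a definitive implementation. The helper name zero_above_limit is hypothetical, as is the assumption that the data lives in arrays whose length is a multiple of 4. On targets without NEON the same steps are emulated lane by lane in scalar code, so the sketch can be tried on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: copies src to dst, replacing every element that is
   greater than limit with 0.0f. Assumes n is a multiple of 4. */
void zero_above_limit(float *dst, const float *src, size_t n, float limit)
{
    assert(n % 4 == 0);
#if defined(__ARM_NEON)
    const float32x4_t vlimit = vdupq_n_f32(limit);
    for (size_t i = 0; i < n; i += 4) {
        float32x4_t v = vld1q_f32(src + i);
        /* All ones in lanes where v <= limit, all zeros elsewhere. */
        uint32x4_t mask = vcleq_f32(v, vlimit);
        /* AND the raw bits with the mask; rejected lanes become +0.0f. */
        uint32x4_t kept = vandq_u32(vreinterpretq_u32_f32(v), mask);
        vst1q_f32(dst + i, vreinterpretq_f32_u32(kept));
    }
#else
    /* Scalar emulation of the same compare-and-mask steps. */
    for (size_t i = 0; i < n; ++i) {
        uint32_t bits;
        memcpy(&bits, &src[i], sizeof bits);
        uint32_t mask = (src[i] <= limit) ? UINT32_MAX : 0u;
        bits &= mask;
        memcpy(&dst[i], &bits, sizeof bits);
    }
#endif
}
```

With the input {4.0f, 2.0f, 4.0f / 3.0f, 1.0f} and a limit of 3.0f, the first element comes out as 0 and the other three are unchanged. If the lanes should hold process(b) rather than b, the same mask can presumably be applied to the processed vector instead, computed before the AND.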