ceil/floor in SSE SIMD
Can anyone suggest a fast way to compute float floor/ceil using pre-SSE4.1 SIMD? I need to handle all the corner cases correctly, e.g. a float value that is not representable as a 32-bit int.
Currently I'm using code similar to the following (I use C intrinsics, converted to asm here for clarity):
;make many copies of the data
movaps xmm0, [float_value]
movaps xmm1, xmm0
movaps xmm2, xmm0
;check if the value is not too large in magnitude
andps xmm1, [exp_mask]
pcmpgtd xmm1, [max_exp]
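;calculate the floor()
cvttps2dq xmm3, xmm2 ;truncate toward zero
cvtdq2ps xmm4, xmm3 ;truncated value back to float
cmpltps xmm2, xmm4 ;all ones (-1) where x < trunc(x), i.e. negative non-integers
paddd xmm3, xmm2 ;subtract 1 only in those lanes
cvtdq2ps xmm2, xmm3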
;combine the results
andps xmm0, xmm1
andnps xmm1, xmm2
orps xmm0, xmm1
Is there a more efficient way to check whether the float value is too large for a 32-bit int?
Here is some pseudocode for a single element that should be directly convertible into vector instructions:
float f;
int i = (int)f; /* 0x80000000 if out of range (as from cvtps2dq) */
if (i == 0x80000000)
return f;
else
return (float)i;
For the cast to int in the second line, set the MXCSR rounding mode to match the operation you want (round down for floor, round up for ceil). You can also test the IE (invalid operation) flag in MXCSR to detect out-of-range values.
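As a rough sketch of how the pseudocode might look with SSE2 intrinsics: the version below keeps the `0x80000000` out-of-range check from the answer, but swaps in the truncating conversion (`cvttps2dq`) plus a correction step instead of changing the MXCSR rounding mode, so that it does not depend on the current rounding state. The function names (`round_ps_sse2`, `floor_ps_sse2`, `ceil_ps_sse2`) are made up for illustration and are not from any library.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <limits.h>     /* INT_MIN == 0x80000000 as a 32-bit int */

/* Shared helper (hypothetical name): want_floor != 0 selects floor, else ceil. */
static __m128 round_ps_sse2(__m128 x, int want_floor)
{
    /* Truncate toward zero. Out-of-range lanes and NaN yield 0x80000000. */
    __m128i i = _mm_cvttps_epi32(x);
    __m128  t = _mm_cvtepi32_ps(i);

    /* Truncation is wrong by one on one side of zero:
       floor needs t - 1 where t > x, ceil needs t + 1 where t < x. */
    __m128 one = _mm_set1_ps(1.0f);
    __m128 r;
    if (want_floor)
        r = _mm_sub_ps(t, _mm_and_ps(_mm_cmpgt_ps(t, x), one));
    else
        r = _mm_add_ps(t, _mm_and_ps(_mm_cmplt_ps(t, x), one));

    /* Lanes where the conversion returned 0x80000000 are either too large
       (hence already integral), NaN, or exactly -2^31: keep x unchanged. */
    __m128 bad = _mm_castsi128_ps(_mm_cmpeq_epi32(i, _mm_set1_epi32(INT_MIN)));
    return _mm_or_ps(_mm_and_ps(bad, x), _mm_andnot_ps(bad, r));
}

static __m128 floor_ps_sse2(__m128 x) { return round_ps_sse2(x, 1); }
static __m128 ceil_ps_sse2(__m128 x)  { return round_ps_sse2(x, 0); }

/* Scalar wrappers, only to make the sketch easy to try out. */
static float floor_first(float f) { return _mm_cvtss_f32(floor_ps_sse2(_mm_set1_ps(f))); }
static float ceil_first(float f)  { return _mm_cvtss_f32(ceil_ps_sse2(_mm_set1_ps(f))); }
```

One known limitation of this sketch: inputs in (-1, 0] for ceil, and -0.0 for floor, come back as +0.0 rather than -0.0, so the sign of zero is not preserved. If that matters, the sign bit of the input could be OR-ed back into the result.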