SSE best way to set register to 0.0's and 1.0's?
I am doing some sse vector3 math.
Generally, I set the 4th component of my vector to 1.0f, as this makes most of my math work, but sometimes I need to set it to 0.0f.
So I want to change something like: (32.4f, 21.2f, -4.0f, 1.0f) to (32.4f, 21.2f, -4.0f, 0.0f)
I was wondering what the best way to do this would be:
- Convert to 4 floats, set 4th float, send back to SSE
- XOR a register with itself, then do 2 shufps
- Do all the SSE math with 1.0f and then set the variables to what they should be when finished.
- Other?
Note: The vector is already in an SSE register when I need to change it.
AND with a constant mask.
In assembly ...
.balign 16 # andps needs a 16-byte aligned memory operand
myMask:
.long 0xffffffff, 0xffffffff, 0xffffffff, 0x00000000
...
andps myMask, %xmm#
where # is the register number (0, 1, 2, ...).
Hope this helps.
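For anyone working in C rather than assembly, here is a minimal sketch of the same AND-with-mask idea using intrinsics. The function name clear_w_mask is made up for illustration; the intrinsics themselves are the standard ones from emmintrin.h.

```c
#include <emmintrin.h>  /* SSE2 header; also provides the SSE float intrinsics */

/* Hypothetical helper: clear the 4th (w) lane of v and keep x, y, z. */
static __m128 clear_w_mask(__m128 v)
{
    /* _mm_set_epi32 lists lanes from highest to lowest: w, z, y, x.
       -1 is all bits set, so x, y, z pass through and w is cleared. */
    const __m128 mask = _mm_castsi128_ps(_mm_set_epi32(0, -1, -1, -1));
    return _mm_and_ps(v, mask);  /* compiles to andps */
}
```

Because this works on the bit pattern, it zeroes w whatever value it held, not only 1.0f.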
Assuming your original vector is in xmm0:
# lanes are listed low to high; in the mask, 1 means "all bits set"
# xmm0 = [x y z w]
xorps %xmm1, %xmm1 # [0 0 0 0]
pcmpeqd %xmm2, %xmm2 # [1 1 1 1]
movss %xmm1, %xmm2 # [0 1 1 1]
pshufd $0x15, %xmm2, %xmm2 # [1 1 1 0]
andps %xmm2, %xmm0 # [x y z 0]
This should be fast since it does not access memory.
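A rough intrinsics version of the same register-only mask construction, as a sketch: the function name clear_w_nomem is hypothetical, and the shuffle immediate 0x15 is my own choice for moving the single zero lane to the top. An optimizing compiler is free to fold all of this into a constant load, so the "no memory access" property is not guaranteed from C.

```c
#include <emmintrin.h>  /* SSE2: _mm_cmpeq_epi32, _mm_shuffle_epi32 */

/* Hypothetical helper: build the [1 1 1 0] mask in registers, then AND. */
static __m128 clear_w_nomem(__m128 v)
{
    __m128  zero = _mm_setzero_ps();                 /* xorps: all lanes zero   */
    __m128i z    = _mm_castps_si128(zero);
    __m128i ones = _mm_cmpeq_epi32(z, z);            /* pcmpeqd: all bits set   */
    /* movss: take lane 0 from zero, lanes 1..3 from ones -> [0 1 1 1] */
    __m128  m    = _mm_move_ss(_mm_castsi128_ps(ones), zero);
    /* pshufd 0x15: result lanes 0,1,2 <- lane 1; lane 3 <- lane 0 -> [1 1 1 0] */
    __m128i mask = _mm_shuffle_epi32(_mm_castps_si128(m), 0x15);
    return _mm_and_ps(v, _mm_castsi128_ps(mask));    /* andps: [x y z 0]        */
}
```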
If you want to do it without memory access, note that 1.0f (bit pattern 0x3F800000) has a zero low word, and 0.0f is all zeroes. So you can just copy the zero word over the other one. If you have the 1.0f in the highest dword, pshufhw xmm0, xmm0, 0xa4 should do the trick:
(gdb) ni
4 pshufhw $0xa4, %xmm0, %xmm0
(gdb) p $xmm0.v4_float
$4 = {32.4000015, 21.2000008, -4, 1}
(gdb) ni
5 ret
(gdb) p $xmm0.v4_float
$5 = {32.4000015, 21.2000008, -4, 0}
A similar trick for the other lanes is left as an exercise to the reader :)
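The same word shuffle can be sketched in C with _mm_shufflehi_epi16. The function name is hypothetical, and the sketch assumes that w is exactly 1.0f (or any value whose low 16 bits are zero); for other values of w the result is not 0.0f.

```c
#include <emmintrin.h>  /* SSE2: _mm_shufflehi_epi16 (pshufhw) */

/* Hypothetical helper: turn w = 1.0f into w = 0.0f without a mask.
   Assumption: the low 16 bits of w are zero, as in 1.0f = 0x3F800000. */
static __m128 one_to_zero_w_shuffle(__m128 v)
{
    /* 0xA4 selects high words 0,1,2,2: the z lane (words 4,5) is kept,
       and word 7 (upper half of w) is overwritten with word 6 (lower half). */
    __m128i r = _mm_shufflehi_epi16(_mm_castps_si128(v), 0xA4);
    return _mm_castsi128_ps(r);
}
```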
pinsrw?
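One possible reading of this suggestion, sketched with the pinsrw intrinsic: overwrite the upper 16 bits of the w lane with zero. The function name is hypothetical, and like the pshufhw trick this assumes w is exactly 1.0f, whose lower 16 bits are already zero.

```c
#include <emmintrin.h>  /* SSE2: _mm_insert_epi16 (pinsrw on xmm registers) */

/* Hypothetical helper: zero the upper word of w.
   Assumption: w is 1.0f (0x3F800000), so clearing word 7 leaves 0x00000000. */
static __m128 one_to_zero_w_pinsrw(__m128 v)
{
    /* Word index 7 is the top 16 bits of the 128-bit register. */
    __m128i r = _mm_insert_epi16(_mm_castps_si128(v), 0, 7);
    return _mm_castsi128_ps(r);
}
```

Note that pinsrw takes its source from a general-purpose register or memory, so a zeroed integer register is needed as well.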
Why not multiply your vector element-wise by [1 1 1 0]? SSE has an instruction for element-wise multiplication (mulps).
Then to go back to a vector with a 1 in the 4th component, just add [0 0 0 1]. There is an SSE instruction for that too (addps).
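A small sketch of the multiply/add approach with intrinsics; both function names are made up for illustration.

```c
#include <xmmintrin.h>  /* SSE: _mm_mul_ps (mulps), _mm_add_ps (addps) */

/* Hypothetical helper: multiply by [1 1 1 0] so that w becomes 0. */
static __m128 w_to_zero_mul(__m128 v)
{
    /* _mm_set_ps lists lanes from highest to lowest: w, z, y, x. */
    return _mm_mul_ps(v, _mm_set_ps(0.0f, 1.0f, 1.0f, 1.0f));
}

/* Hypothetical helper: add [0 0 0 1] so that a zero w becomes 1. */
static __m128 w_to_one_add(__m128 v)
{
    return _mm_add_ps(v, _mm_set_ps(1.0f, 0.0f, 0.0f, 0.0f));
}
```

Unlike the AND mask, this depends on the value of w: a negative w gives -0.0f, and a NaN or infinite w gives NaN rather than zero.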