SSE best way to set register to 0.0's and 1.0's?

2023-02-08 20:12 问答作者：

I am doing some sse vector3 math.

Generally, I set the 4th digit of my vector to 1.0f, as this makes most of my math work, but sometimes I need to set it to 0.0f.

So I want to change something like: (32.4f, 21.2f, -4.0f, 1.0f) to (32.4f, 21.2f, -4.0f, 0.0f)

I was wondering what the best method to doing so would be:

Convert to 4 floats, set 4th float, send back to SSE
xor a register with itself, then do 2 s开发者_开发技巧hufps
Do all the SSE math with 1.0f and then set the variables to what they should be when finished.
Other?

Note: The vector is already in a SSE register when I need to change it.

AND with a constant mask.

In assembly ...

myMask:
.long 0xffffffff, 0xffffffff, 0xffffffff, 0x00000000

...
andps  myMask, %xmm#

where # = {0, 1, 2, ....}

Hope this helps.

Assuming your original vector is in xmm0:

; xmm0 = [x y z w]
xorps %xmm1, %xmm1         ; [0 0 0 0]
pcmpeqs %xmm2, %xmm2       ; [1 1 1 1] 
movss %xmm1, %xmm2         ; [0 1 1 1]
pshufd $0x20, %xmm1, %xmm2 ; [1 1 1 0]
andps %xmm2, %xmm0         ; [x y z 0]

should be fast since it does not access memory.

If you want to do it without memory access, you could realize that the value 1 has a zero word in it, and the value zero is all zeroes. So, you can just copy the zero word to the other. If you have the 1 in the highest dword, pshufhw xmm0, xmm0, 0xa4 should do the trick:

(gdb) ni
4       pshufhw $0xa4, %xmm0, %xmm0
(gdb) p $xmm0.v4_float
$4 = {32.4000015, 21.2000008, -4, 1}
(gdb) ni
5       ret
(gdb) p $xmm0.v4_float
$5 = {32.4000015, 21.2000008, -4, 0}

The similar trick for the other locations is left as an excercise to the reader :)

pinsrw?

Why not multiply your vector element wise with [1 1 1 0]? I'm pretty sure there is an SSE instruction for element wise multiplication.

Then to go back to a vector with a 1 in the 4th dimension, just add [0 0 0 1]. Again there is an SSE instruction for that, too.

继续阅读：assembly c math sse vector

SSE best way to set register to 0.0's and 1.0's?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？