Mode for _mm_cmpistrm SSE4.2 intrinsic
I'm trying to figure out how to set the "mode" flag for the _mm_cmpistrm SSE4.2 intrinsic. I have a regular C string (char*) that I am loading into an __m128i type with _mm_lddqu_si128. I was going to use unsigned bytes with regular string comparison:
_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH
But I'm confused about what to set for the unit vs. bit mask. Here are the macros from smmintrin.h in GCC 4.3.2:
/* These macros specify the output selection in _mm_cmpXstrm (). */
#define _SIDD_BIT_MASK 0x00
#define _SIDD_UNIT_MASK 0x40
I think I understand what the bit mask is: I will get a 1 in bits 0..15 if the char in that position differs between the two strings. But what does the unit mask do?
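For _SIDD_BIT_MASK you'll get a bit mask in the low bits of the returned __m128i, zero-extended to 128 bits: one bit per element, set to 1 when the comparison at that position is true. With _SIDD_CMP_EQUAL_EACH and no negative polarity, that means a bit is 1 where the two characters are equal, not where they differ. If you're doing _SIDD_UBYTE_OPS you'll get 16 bits returned (one for each character in the string); with word operations you get 8.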
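With _SIDD_UNIT_MASK, however, you'll get the same mask expanded so that each element is all ones or all zeros. With _SIDD_UBYTE_OPS that is 16 bytes: bits 0..7 (byte 0) will all be 1 if the comparison of the first character of each string is true, bits 8..15 (byte 1) cover the second character, and so on. Only with word operations (_SIDD_UWORD_OPS) is each element 16 bits wide, so that bits 0..15 cover the first character and bits 16..31 the second.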
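To make the difference concrete, here is a minimal sketch that runs the same byte-wise comparison with both output modes. The helper names (load16, bit_mask_equal_each, unit_mask_equal_each) are made up for illustration, and the target attribute assumes GCC or Clang on x86; with another compiler you would drop it and enable SSE4.2 through a compiler flag instead.

```c
#include <assert.h>
#include <nmmintrin.h>  /* SSE4.2 string intrinsics (pulls in smmintrin.h) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy at most 16 bytes of a C string into a
   zero-padded buffer, so the load never reads past the end of the string. */
__attribute__((target("sse4.2")))
static __m128i load16(const char *s)
{
    char buf[16] = {0};
    size_t i;

    assert(s != NULL);
    for (i = 0; i < 16 && s[i] != '\0'; i++)
        buf[i] = s[i];
    return _mm_loadu_si128((const __m128i *)buf);
}

/* _SIDD_BIT_MASK: one bit per byte position in the low 16 bits of the
   result; a bit is 1 where the two bytes compare equal. Positions past
   the terminating NUL in both inputs are also reported as equal. */
__attribute__((target("sse4.2")))
static unsigned bit_mask_equal_each(const char *a, const char *b)
{
    __m128i m = _mm_cmpistrm(load16(a), load16(b),
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_BIT_MASK);
    return (unsigned)_mm_cvtsi128_si32(m) & 0xFFFFu;
}

/* _SIDD_UNIT_MASK: the same result, but each byte position is widened
   to a full byte, 0xFF where equal and 0x00 where not. */
__attribute__((target("sse4.2")))
static void unit_mask_equal_each(const char *a, const char *b, uint8_t out[16])
{
    __m128i m = _mm_cmpistrm(load16(a), load16(b),
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_UNIT_MASK);
    _mm_storeu_si128((__m128i *)out, m);
}
```

For two 16-character strings that differ only at index 2, bit_mask_equal_each should return a value with every bit set except bit 2, while unit_mask_equal_each should fill out with 0xFF everywhere except out[2], which is 0x00. The unit mask is convenient when you want to feed the result straight into another SIMD operation such as _mm_and_si128; the bit mask is easier to inspect with ordinary integer code.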