SSE: convert __m128 and __m128i into two __m128d
Two related questions.
My code needs to do the following with a fairly large amount of data. It runs inside inner loops, so performance is important.
- Convert an array of __int32 into doubles (or convert __m128i into two __m128d).
- Convert an array of floats into doubles (or convert __m128 into two __m128d).
Basically, I need functions with the following signatures:
void convert_int_to_double(__int32 const * input, double * output);
void convert_float_to_double(float const * input, double * output);
Input and output pointers are aligned and the number of elements is a multiple of 4. The main problem is how to quickly unpack a __m128 (or __m128i) into two __m128d.
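The intrinsics _mm_cvtepi32_pd and _mm_cvtps_pd each convert the lower two elements of their input to double. Each 4-element input vector therefore takes two conversions: one on the vector as loaded, and one after its upper half has been moved down into the lower half.
For the integer case, the loop should look like this: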
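__m128i* base_addr = ...;
for( int i = 0; i < cnt; ++i )   // cnt counts __m128i vectors, i.e. groups of four ints
{
    __m128i epi32 = _mm_load_si128( base_addr + i );
    __m128d v0 = _mm_cvtepi32_pd( epi32 );   // converts elements 0 and 1
    epi32 = _mm_srli_si128( epi32, 8 );      // shift right by 8 bytes: elements 2 and 3 move to the low half
    __m128d v1 = _mm_cvtepi32_pd( epi32 );   // converts elements 2 and 3
    ....
}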
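Here is one way both functions from the question could be filled in. It is only a sketch: the `count` parameter is my own addition (the signatures in the question do not say how the length is passed), I use `int32_t` in place of `__int32` for portability, and the float version uses `_mm_movehl_ps` to bring the upper two floats down, which is my choice and not something from the loop above. Both functions assume 16-byte-aligned pointers and a length that is a multiple of 4, as stated in the question.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#include <stddef.h>
#include <stdint.h>

/* 'count' is a hypothetical extra parameter (number of elements, not vectors). */
void convert_int_to_double(const int32_t *input, double *output, size_t count)
{
    assert(count % 4 == 0);  /* precondition taken from the question */
    for (size_t i = 0; i < count; i += 4) {
        __m128i v  = _mm_load_si128((const __m128i *)(input + i));
        __m128d lo = _mm_cvtepi32_pd(v);                     /* elements 0, 1 */
        __m128d hi = _mm_cvtepi32_pd(_mm_srli_si128(v, 8));  /* elements 2, 3 */
        _mm_store_pd(output + i, lo);
        _mm_store_pd(output + i + 2, hi);
    }
}

void convert_float_to_double(const float *input, double *output, size_t count)
{
    assert(count % 4 == 0);  /* precondition taken from the question */
    for (size_t i = 0; i < count; i += 4) {
        __m128  v  = _mm_load_ps(input + i);
        __m128d lo = _mm_cvtps_pd(v);                        /* elements 0, 1 */
        /* _mm_movehl_ps(v, v) copies the upper two floats into the lower two */
        __m128d hi = _mm_cvtps_pd(_mm_movehl_ps(v, v));      /* elements 2, 3 */
        _mm_store_pd(output + i, lo);
        _mm_store_pd(output + i + 2, hi);
    }
}
```

Both conversions are exact, since every 32-bit integer and every float is representable as a double. Whether this beats a plain scalar loop on your compiler and CPU is something you would have to measure.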