开发者

Thrust: sort_by_key with zip_iterator performance

Problem

I am using sort_by_key with the values being passed using a zip_iterator. This sort_by_key is called many times, and after a certain iteration it becomes 10x slower! What is the cause of this drop in performance?

Symptom

I am sorting 3 vectors using sort_by_key, one of them acts as the key vector:

struct Segment
{
  int v[2];
};

thrust::device_vector<int> keyVec;
thrust::device_vector<int> valVec;
thrust::device_vector<Segment> segVec;

// ... code which fills these vectors ...

thrust::sort_by_key( keyVec.begin(), keyVec.end(),
                     make_zip_iterator( make_tuple( valVec.begin(), segVec.begin() ) ) );

The size of the vector is usually about 4 million. In the initial 2 times it is called, the sort_by_key takes 0.04s, in loop 3 it takes 0.1s and then degrades further to 0.3s for the rest of the loops. Thus, we see a 10x degradation in performance.

Extra Information

To ensure that the only factor of degradation was开发者_如何学JAVA sort_by_key, I replaced the above with manual sorting using a handwritten kernel:

thrust::device_vector<int> indexVec( keyVec.size() );
thrust::sequence( indexVec.begin(), indexVec.end() );

// Sort the keys and indexes
thrust::sort_by_key( keyVec.begin(), keyVec.end(), indexVec.begin() );

thrust::device_vector<int> valVec2( keyVec.size() );
thrust::device_vector<Segment> segVec2( keyVec.size() );

// Use index array and move vectors to destination
moveKernel<<< x, y >>>(
  toRawPtr( indexVec ),
  indexVec.size(),
  toRawPtr( valVec ),
  toRawPtr( segVec ),
  toRawPtr( valVec2 ),
  toRawPtr( segVec2 ) );

// Swap back into original vectors
valVec.swap( valVec2 );
segVec.swap( segVec2 );

This handwritten sort takes 0.03s and this performance is consistent across all iterations, unlike the performance drop seen with sort_by_key and zip_iterator.


For accurate timing across each loop, you need to use cudaThreadSynchronize at the end of each loop. The timings you are getting for the first two loops may not be the actual timing you are looking for.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜