Worse performance using Eigen than using my own class

2023-03-09 08:14 问答作者：

A couple of weeks ago I asked a question about the performance of matrix multiplication.

I was told that in order to enhance the performance of my program I should use some specialised matrix classes rather than my own class.

StackOverflow users recommended:

uBLAS
EIGEN
BLAS

At first I wanted to use uBLAS however reading documentation it turned out that this library doesn't support matrix-matrix multiplication.

After all I decided to use EIGEN library. So I exchanged my matrix class to Eigen::MatrixXd - however it turned out that now my application works even slower than before. Time before using EIGEN was 68 seconds and after exchanging my matrix class to EIGEN matrix program runs for 87 seconds.

Parts of program which take the most time looks like that

TemplateClusterBase* TemplateClusterBase::TransformTemplateOne( vector<Eigen::MatrixXd*>& pointVector, Eigen::MatrixXd& rotation ,Eigen::MatrixXd& scale,Eigen::MatrixXd& translation )
{   
    for (int i=0;i<pointVector.size();i++ )
    {
        //Eigen::MatrixXd outcome =
        Eigen::MatrixXd outcome = (rotation*scale)* (*pointVector[i])  + translation;
        //delete  prototypePointVector[i];      // ((rotation*scale)* (*prototypePointVector[i])  + translation).ConvertToPoint();
        MatrixHelper::SetX(*prototypePointVector[i],MatrixHelper::GetX(outcome));
        MatrixHelper::SetY(*pro开发者_如何学PythontotypePointVector[i],MatrixHelper::GetY(outcome));
        //assosiatedPointIndexVector[i]    = prototypePointVector[i]->associatedTemplateIndex = i;
    }

    return this;
}

and

Eigen::MatrixXd AlgorithmPointBased::UpdateTranslationMatrix( int clusterIndex )
{
    double membershipSum = 0,outcome = 0;
    double currentPower = 0;
    Eigen::MatrixXd outcomePoint = Eigen::MatrixXd(2,1);
    outcomePoint << 0,0;
    Eigen::MatrixXd templatePoint;
    for (int i=0;i< imageDataVector.size();i++)
    {
        currentPower =0; 
        membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m);
        outcomePoint.noalias() +=  (*imageDataVector[i] - (prototypeVector[clusterIndex]->rotationMatrix*prototypeVector[clusterIndex]->scalingMatrix* ( *templateCluster->templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]) ))*currentPower ;
    }

    outcomePoint.noalias() = outcomePoint/=membershipSum;
    return outcomePoint; //.ConvertToMatrix();
}

As You can see, these functions performs a lot of matrix operations. That is why I thought using Eigen would speed up my application. Unfortunately (as I mentioned above), the program works slower.

Is there any way to speed up these functions?

Maybe if I used DirectX matrix operations I would get better performance ?? (however I have a laptop with integrated graphic card).

If you're using Eigen's MatrixXd types, those are dynamically sized. You should get much better results from using the fixed size types e.g Matrix4d, Vector4d.

Also, make sure you're compiling such that the code can get vectorized; see the relevant Eigen documentation.

Re your thought on using the Direct3D extensions library stuff (D3DXMATRIX etc): it's OK (if a bit old fashioned) for graphics geometry (4x4 transforms etc), but it's certainly not GPU accelerated (just good old SSE, I think). Also, note that it's floating point precision only (you seem to be set on using doubles). Personally I'd much prefer to use Eigen unless I was actually coding a Direct3D app.

Make sure to have compiler optimization switched on (e.g. at least -O2 on gcc). Eigen is heavily templated and will not perform very well if you don't turn on optimization.

Which version of Eigen are you using? They recently released 3.0.1, which is supposed to be faster than 2.x. Also, make sure you play a bit with the compiler options. For example, make sure SSE is being used in Visual Studio:

C/C++ --> Code Generation --> Enable Enhanced Instruction Set

You should profile and then optimize first the algorithm, then the implementation. In particular, the posted code is quite innefficient:

for (int i=0;i<pointVector.size();i++ )
{
   Eigen::MatrixXd outcome = (rotation*scale)* (*pointVector[i])  + translation;

I don't know the library, so I won't even try to guess the number of unnecessary temporaries that you are creating, but a simple refactor:

Eigen::MatrixXd tmp = rotation*scale;
for (int i=0;i<pointVector.size();i++ )
{
   Eigen::MatrixXd outcome = tmp*(*pointVector[i])  + translation;

Can save you a good amount of expensive multiplications (and again, probably new temporary matrices that get discarded right away.

A couple of points.

Why are you multiplying rotation*scale inside of the loop when that product will have the same value each iteration? That is a lot of wasted effort.
You are using dynamically sized matrices rather than fixed sized matrices. Someone else mentioned this already, and you said you shaved off 2 sec.
You are passing arguments as a vector of pointers to matrices. This adds an extra pointer indirection and destroys any guarantee of data locality, which will give poor cache performance.
I hope this isn't insulting, but are you compiling in Release or Debug? Eigen is very slow in debug builds, because it uses lots of trivial templated functions that are optimized out of release but remain in debug.

Looking at your code, I am hesitant to blame Eigen for performance problems. However, most linear algebra libraries (including Eigen) are not really designed for your use case of lots of tiny matrices. In general, Eigen will be better optimized for 100x100 or larger matrices. You very well may be better off using your own matrix class or the DirectX math helper classes. The DirectX math classes are completely independent from your video card.

Looking back at your previous post and the code in there, my suggestion would be to use your old code, but improve its efficiency by moving things around. I'm posting on that previous question to keep the answers separate.

继续阅读：eigen matrix performance

Worse performance using Eigen than using my own class

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？