Limitations of the Levenberg-Marquardt algorithm

2023-01-30 05:13

I am using the Levenberg-Marquardt algorithm (LMA) to minimize a non-linear function of 6 parameters. I have about 50 data points for each minimization, but the results are not sufficiently accurate. Can the fact that my parameters differ from each other by a few orders of magnitude be that significant? If yes, where should I look for a solution? If no, what limitations of LMA have you met in your work (it may help me find other problems with my application)? Many thanks for your help.
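One common way to deal with parameters of very different magnitudes is to let the solver work on rescaled variables that are all of order one. The following is only a minimal sketch of that idea: `ParameterScaling`, `scaledCost`, and the parameter order (tx, ty, tz, rx, ry, rz) are hypothetical names and assumptions made for illustration, not part of any particular LM library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Assumed parameter order: tx, ty, tz, rx, ry, rz.
using Params = std::array<double, 6>;

// Hypothetical helper: maps between physical parameters and the
// roughly O(1) variables that the optimizer actually sees.
struct ParameterScaling {
    Params scale;  // assumed typical magnitude of each parameter, all nonzero

    // physical -> solver variables: q_i = p_i / s_i
    Params toSolver(const Params& physical) const {
        Params q{};
        for (std::size_t i = 0; i < q.size(); ++i) {
            assert(scale[i] != 0.0);
            q[i] = physical[i] / scale[i];
        }
        return q;
    }

    // solver variables -> physical: p_i = q_i * s_i
    Params toPhysical(const Params& solver) const {
        Params p{};
        for (std::size_t i = 0; i < p.size(); ++i) {
            p[i] = solver[i] * scale[i];
        }
        return p;
    }
};

// Evaluates a cost written in physical units at the scaled point q,
// so the optimizer only ever manipulates the scaled variables.
template <typename Cost>
double scaledCost(const Cost& cost, const ParameterScaling& s, const Params& q) {
    return cost(s.toPhysical(q));
}
```

With translations of a few tens of units and rotations around 1e-4 rad, a scale vector such as {10, 10, 10, 1e-3, 1e-3, 1e-3} would bring all six solver variables to a comparable size; the right values depend on the data and are a guess here.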

Edit: The problem I am trying to solve is to determine the best transformation T:

typedef struct 
{
    double x_translation, y_translation, z_translation; 
    double x_rotation, y_rotation, z_rotation;
} transform_3D;

to fit a set of 3D points to a bunch of 3D lines. In detail, I have a set of coordinates of 3D points and the equations of the corresponding 3D lines, which should pass through those points (in the ideal situation). The LMA minimizes the sum of distances of the transformed 3D points to the corresponding 3D lines. The transform function is as follows:

cv::Point3d Geometry::transformation_3D(cv::Point3d point, transform_3D transformation)
{
    cv::Point3d p_odd,p_even;

    //rotation x
    p_odd.x=point.x;
    p_odd.y=point.y*cos(transformation.x_rotation)-point.z*sin(transformation.x_rotation); 
    p_odd.z=point.y*sin(transformation.x_rotation)+point.z*cos(transformation.x_rotation);

    //rotation y
    p_even.x=p_odd.z*sin(transformation.y_rotation)+p_odd.x*cos(transformation.y_rotation);
    p_even.y=p_odd.y;
    p_even.z=p_odd.z*cos(transformation.y_rotation)-p_odd.x*sin(transformation.y_rotation);

    //rotation z
    p_odd.x=p_even.x*cos(transformation.z_rotation)-p_even.y*sin(transformation.z_rotation);
    p_odd.y=p_even.x*sin(transformation.z_rotation)+p_even.y*cos(transformation.z_rotation);
    p_odd.z=p_even.z;

    //translation
    p_even.x=p_odd.x+transformation.x_translation;
    p_even.y=p_odd.y+transformation.y_translation;
    p_even.z=p_odd.z+transformation.z_translation;

    return p_even;
}

Hope this explanation will help a bit...
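For reference, the quantity being minimized (distance of each transformed point to its line) can be sketched as below. This is an illustration under assumptions, not the asker's actual code: it uses a plain `Vec3` struct instead of `cv::Point3d`, and `pointLineDistance` and `totalDistance` are hypothetical names. Every line is assumed to pass through the origin, as in the sample data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

double norm(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Distance from point p to the line through `origin` with direction `dir`:
// |(p - origin) x dir| / |dir|. The direction need not be normalized.
double pointLineDistance(const Vec3& p, const Vec3& origin, const Vec3& dir) {
    const double len = norm(dir);
    assert(len > 0.0);
    const Vec3 d{p.x - origin.x, p.y - origin.y, p.z - origin.z};
    return norm(cross(d, dir)) / len;
}

// Sum of distances of already-transformed points to their lines,
// all lines assumed to pass through (0, 0, 0).
double totalDistance(const std::vector<Vec3>& points,
                     const std::vector<Vec3>& dirs) {
    assert(points.size() == dirs.size());
    const Vec3 origin{0.0, 0.0, 0.0};
    double sum = 0.0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        sum += pointLineDistance(points[i], origin, dirs[i]);
    }
    return sum;
}
```

Note that Levenberg-Marquardt is a least-squares method: implementations usually expect one residual per data point and minimize the sum of their squares, so handing the solver the individual distances (rather than a single summed value) is likely to be the better fit.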

Edit2:

Some exemplary data is pasted below. The 3D lines are described by a center point and a direction vector. The center point of every line is (0,0,0) and the 'uz' coordinate of each vector is equal to 1. Set of 'ux' coordinates of the direction vectors:

-1.0986, -1.0986, -1.0986,
-1.0986, -1.0990, -1.0986,
-1.0986, -1.0986, -0.9995,
-0.9996, -0.9996, -0.9995,
-0.9995, -0.9995, -0.9996,
-0.9003, -0.9003, -0.9004,
-0.9003, -0.9003, -0.9003,
-0.9003, -0.9003, -0.8011,
-0.7020, -0.7019, -0.6028,
-0.5035, -0.5037, -0.4045,
-0.3052, -0.3053, -0.2062,
-0.1069, -0.1069, -0.1075,
-0.1070, -0.1070, -0.1069,
-0.1069, -0.1070, -0.0079,
-0.0079, -0.0079, -0.0078,
-0.0078, -0.0079, -0.0079,
 0.0914,  0.0914,  0.0913,
 0.0913,  0.0914,  0.0915,
 0.0914,  0.0914

Set of 'uy' coordinates of directional vectors:

-0.2032,  -0.0047,    0.1936,
0.3919,    0.5901,    0.7885,
0.9869,    1.1852,    -0.1040,
0.0944,    0.2927,    0.4911,
0.6894,    0.8877,    1.0860,
-0.2032,  -0.0047,    0.1936,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852,    1.0860,
0.9869,    1.1852,    1.0861,
0.9865,    1.1853,    1.0860,
0.9870,    1.1852,    1.0861,
-0.2032,  -0.0047,    0.1937,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852,    -0.1039,
0.0944,    0.2927,    0.4911,
0.6894,    0.8877,    1.0860,
-0.2032,  -0.0047,    0.1935,
0.3919,    0.5902,    0.7885,
0.9869,    1.1852

and the set of 3D points in {x, y, z} form:

 {{0, 0, 0}, {0, 16, 0},   {0, 32, 0}, 
 {0, 48, 0}, {0, 64, 0},   {0, 80, 0},
 {0, 96, 0}, {0, 112,0},   {8, 8, 0},
 {8, 24, 0}, {8, 40, 0},   {8, 56, 0}, 
 {8, 72, 0}, {8, 88, 0},   {8, 104, 0}, 
 {16, 0, 0}, {16, 16,0},   {16, 32, 0}, 
{16, 48, 0}, {16, 64, 0},  {16, 80, 0}, 
{16, 96, 0}, {16, 112, 0}, {24, 104, 0}, 
{32, 96, 0}, {32, 112, 0}, {40, 104, 0},
{48, 96, 0}, {48, 112, 0}, {56, 104, 0},
{64, 96, 0}, {64, 112, 0}, {72, 104, 0}, 
{80, 0, 0},  {80, 16, 0},  {80, 32, 0},
{80,48, 0},  {80, 64, 0},  {80, 80, 0}, 
{80, 96, 0}, {80, 112, 0}, {88,  8, 0}, 
{88, 24, 0}, {88, 40, 0},  {88, 56, 0},
{88, 72, 0}, {88, 88, 0},  {88, 104, 0},
{96, 0, 0},  {96, 16, 0},  {96, 32, 0}, 
{96, 48,0},  {96, 64, 0},  {96, 80, 0}, 
{96, 96, 0}, {96, 112, 0}}

This is a kind of "easy" modelled data set with very small rotations.

Well, the proper way of using Levenberg-Marquardt is to start from a good initial estimate (a "seed") for your parameters. Recall that LM is a damped variant of Gauss-Newton, itself a relative of Newton-Raphson; as with all such iterative algorithms, the quality of your starting point will make or break your iteration: it either converges to what you want, converges to something completely different (not that unlikely, especially if you have a lot of parameters), or shoots off into the wild blue yonder (diverges).
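To make the role of the seed and of the damping concrete, here is a minimal one-parameter sketch of a Levenberg-Marquardt loop. It is not the asker's problem: it fits the toy model y = exp(-a * x), and `sumSquares` and `fitDecayRate` are hypothetical names chosen for this illustration. A step is accepted only if it lowers the cost; otherwise the damping factor grows and a shorter step is tried.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of squared residuals r_i = y_i - exp(-a * x_i).
double sumSquares(double a, const std::vector<double>& xs,
                  const std::vector<double>& ys) {
    double s = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double r = ys[i] - std::exp(-a * xs[i]);
        s += r * r;
    }
    return s;
}

// One-parameter Levenberg-Marquardt iteration starting from `seed`.
double fitDecayRate(double seed, const std::vector<double>& xs,
                    const std::vector<double>& ys, int maxIter = 100) {
    assert(xs.size() == ys.size());
    double a = seed;
    double lambda = 1e-3;  // damping factor
    double current = sumSquares(a, xs, ys);
    for (int it = 0; it < maxIter; ++it) {
        double jtj = 0.0, jtr = 0.0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            const double e = std::exp(-a * xs[i]);
            const double r = ys[i] - e;
            const double j = xs[i] * e;  // dr_i/da
            jtj += j * j;
            jtr += j * r;
        }
        if (jtj == 0.0) break;  // no sensitivity to the parameter at all
        // Damped Gauss-Newton step with Marquardt's diagonal scaling.
        const double step = -jtr / (jtj * (1.0 + lambda));
        if (std::fabs(step) < 1e-14) break;
        const double trial = sumSquares(a + step, xs, ys);
        if (trial < current) {  // accept the step and trust the model more
            a += step;
            current = trial;
            lambda *= 0.1;
        } else {                // reject the step and damp harder
            lambda *= 10.0;
        }
    }
    return a;
}
```

Calling `fitDecayRate(1.0, xs, ys)` on noise-free samples of exp(-0.5 * x) should recover a value close to 0.5. With several parameters and a poor seed the same loop can just as well settle in a different local minimum, which is the failure mode described above.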

In any event, it would be more helpful if you could mention the model function you're fitting, and possibly a scatter plot of your data; it might go a long way towards finding a workable solution for this.

I would suggest you try a different approach and find your rotation parameters indirectly, namely by using a 4x4 affine transformation matrix that incorporates the translation and rotation parameters.

This gets rid of the nonlinearity of the sine and cosine functions; the angles themselves can be recovered from the matrix after the fact.

The tough part would be constraining the transformation matrix so that it does not shear or scale, which you don't want.
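As a rough illustration of the matrix form, the sketch below builds the 4x4 homogeneous matrix M = T * Rz * Ry * Rx (the same rotation order as the question's transformation_3D) and measures how far its upper-left 3x3 block is from a pure rotation. All names here (`Mat4`, `makeTransform`, `orthogonalityError`, and so on) are hypothetical and chosen for this example only.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;

Mat4 identity() {
    Mat4 m{};  // value-initialized to all zeros
    for (std::size_t i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 rotationX(double t) {
    Mat4 m = identity();
    m[1][1] = std::cos(t); m[1][2] = -std::sin(t);
    m[2][1] = std::sin(t); m[2][2] = std::cos(t);
    return m;
}

Mat4 rotationY(double t) {
    Mat4 m = identity();
    m[0][0] = std::cos(t);  m[0][2] = std::sin(t);
    m[2][0] = -std::sin(t); m[2][2] = std::cos(t);
    return m;
}

Mat4 rotationZ(double t) {
    Mat4 m = identity();
    m[0][0] = std::cos(t); m[0][1] = -std::sin(t);
    m[1][0] = std::sin(t); m[1][1] = std::cos(t);
    return m;
}

Mat4 translation(double tx, double ty, double tz) {
    Mat4 m = identity();
    m[0][3] = tx; m[1][3] = ty; m[2][3] = tz;
    return m;
}

// Rotate about x, then y, then z, then translate.
Mat4 makeTransform(double tx, double ty, double tz,
                   double rx, double ry, double rz) {
    return multiply(translation(tx, ty, tz),
                    multiply(rotationZ(rz),
                             multiply(rotationY(ry), rotationX(rx))));
}

// Applies an affine matrix (last row 0 0 0 1) to the point (x, y, z, 1).
Vec3 apply(const Mat4& m, const Vec3& p) {
    assert(m[3][3] == 1.0);
    Vec3 out{};
    for (std::size_t i = 0; i < 3; ++i)
        out[i] = m[i][0] * p[0] + m[i][1] * p[1] + m[i][2] * p[2] + m[i][3];
    return out;
}

// Largest entry of |R^T R - I| for the upper-left 3x3 block R.
// It is zero (up to rounding) when R is orthonormal, i.e. has no shear or scale.
double orthogonalityError(const Mat4& m) {
    double worst = 0.0;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j) {
            double dot = 0.0;
            for (std::size_t k = 0; k < 3; ++k) dot += m[k][i] * m[k][j];
            const double err = std::fabs(dot - (i == j ? 1.0 : 0.0));
            if (err > worst) worst = err;
        }
    return worst;
}
```

If the twelve free matrix entries were fitted directly, a check like `orthogonalityError` could be used to detect (or penalize) results that have drifted into shear or scaling; how best to enforce the constraint is left open here.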

Here you have your problem modeled and running with Mathematica.

I used the "Levenberg-Marquardt" method.

This is why I asked for your data. With MY data, YOUR problems are always going to be easier:)

xnew[x_, y_, z_] := 
  RotationMatrix[rx, {1, 0, 0}].RotationMatrix[
     ry, {0, 1, 0}].RotationMatrix[rz, {0, 0, 1}].{x, y, z} + {tx, ty, tz};

(* Generate Sample Data*)
(* Angles 1/2,1/3,1/5 *)
(* translation -> {1,2,3} *)
(* Uniform noise in [-0.05, 0.05] added to each coordinate *)

data = Table[{{x, y, z},
  RotationMatrix[1/2, {1, 0, 0}].
  RotationMatrix[1/3, {0, 1, 0}].
  RotationMatrix[1/5, {0, 0, 1}].{x, y, z} +{1, 2, 3} +RandomReal[{-.05, .05}, 3]},
  {x, 0, 1, .1}, {y, 0, 1, .1}, {z, 0, 1, .1}];

data = Flatten[data, 2];

(* Now find the parameters*)
FindMinimum[
 Sum[SquaredEuclideanDistance[xnew[i[[1]] /. List -> Sequence], 
   i[[2]]], {i, data}]
 , {rx, ry, rz, tx, ty, tz}, Method -> "LevenbergMarquardt"]

Out:

{3.2423, {rx -> 0.500566, ry -> 0.334012, rz -> 0.199902, 
          tx -> 0.99985,  ty -> 1.99939,  tz -> 3.00021}}

(Within 1/1000 of the real values)

Edit

I worked a little with your data.
The problem is that your system is very badly conditioned. You need much more data to reliably estimate such small rotations.

These are the results I got:

Rotations in degrees:

rx = 179.99999999999999999999984968493536659553226696793
ry = 180.00000000000000000000006934755799995159952661222
rz = 180.0006286861217378980724139120849587855611645627

Translations:

tx = 48.503663696727576867196234527227830090575281353092
ty = 63.974139455057300403798198525151849767949596684232
tz = -0.99999999999999999999997957276716543927459921348549

I should calculate the errors, but I've no time right now.

BTW, rz = Pi + 0.000011 (in radians)

HTH!

Well, I used ceres-solver to solve this, but I did make one modification to your data. Instead of "uz=1.0", I used "uz=0.0", which makes this entirely a 2D data-fitting problem.

I got the following results. trans: -88.6384, -16.3879, 0 rot: 0, 0, -6.97813e-05

After getting these results, I manually calculated the sum of orthogonal distances of the transformed points to the corresponding lines and got 0.0280452.

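// Forward declaration: xform is defined below but used inside CostFunctor.
template <typename T>
void xform(const T* x, const T* inPoint, T* outPoint3);

struct CostFunctor {
    CostFunctor(const double p[3],  double ux, double uy){
        p_[0] = p[0];p_[1] = p[1];p_[2] = p[2];
        n_[0] = ux; n_[1] = uy;
        n_[2] = 0.0;
        // normalize() (not shown) is assumed to scale n_ to unit length
        normalize(n_);
    }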

    template <typename T>
    bool operator()(const T* const x, T* residual) const {
        T pDash[3];
        T pIn[3];
        T temp[3];
        pIn[0] = T(p_[0]);
        pIn[1] = T(p_[1]);
        pIn[2] = T(p_[2]);
        //transform the input point p_ to pDash
        xform(x, &pIn[0], &pDash[0]);
        //find dot(pDash, n), where n is the direction of line
        T pDashDotN = T(pDash[0]) * T(n_[0]) + T(pDash[1]) * T(n_[1]) + T(pDash[2]) * T(n_[2]);
        //projection of pDash along line
        temp[0] = pDashDotN * n_[0];temp[1] = pDashDotN * n_[1];temp[2] = pDashDotN * n_[2];
        //orthogonal vector from projection to point
        temp[0] = pDash[0] - temp[0];temp[1] = pDash[1] - temp[1];temp[2] = pDash[2] - temp[2];
        //squared error (note: Ceres squares each residual again when forming its cost)
        residual[0] = temp[0] * temp[0] + temp[1] * temp[1] + temp[2] * temp[2];
        return true;
    }
    //untransformed point
    double p_[3];
    //direction of line (normalized in the constructor)
    double n_[3];
};


template<typename T>
void  xform(const T *x, const T * inPoint, T *outPoint3) {
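    // x[0..2] = translation, x[3..5] = rotation angles about x, y, z
    // note: the signs below rotate about x by -xTheta compared with the
    // question's transformation_3D
    T xTheta = x[3];
    T pOdd[3], pEven[3];
    pOdd[0] = inPoint[0];
    pOdd[1] = inPoint[1] * cos(xTheta) + inPoint[2] * sin(xTheta);
    pOdd[2] = -inPoint[1] * sin(xTheta) + inPoint[2] * cos(xTheta);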

    T yTheta = x[4];
    pEven[0] = pOdd[0] * cos(yTheta) + pOdd[2] * sin(yTheta);
    pEven[1] = pOdd[1];
    pEven[2] = -pOdd[0] * sin(yTheta) + pOdd[2] * cos(yTheta);


    T zTheta = x[5];

    pOdd[0] = pEven[0] * cos(zTheta) - pEven[1] * sin(zTheta);
    pOdd[1] = pEven[0] * sin(zTheta) + pEven[1] * cos(zTheta);
    pOdd[2] = pEven[2];

    T xTrans = x[0], yTrans = x[1], zTrans = x[2];
    pOdd[0] += xTrans;
    pOdd[1] += yTrans;
    pOdd[2] += zTrans;

    outPoint3[0] = pOdd[0];
    outPoint3[1] = pOdd[1];
    outPoint3[2] = pOdd[2];
}
