Limitations of the Levenberg-Marquardt algorithm
I am using Levenberg-Marquardt algorithm to minimize a non-linear function of 6 parameters. I have got about 50 data points for each minimization, but I do not get sufficiently accurate results. Does the fact, that my parameters differ from each other by a few orders of magnitudes can be so much significant? If yes, where should I look for the solution? If no, what kind of limitations of LMA you met in your work (it may help to find other problems with my applictaion)? Many Thanks for your help.
Edit: The problem I am trying to solve is to determine the best transformation T:
typedef struct
{
double x_translation, y_translation, z_translation;
double x_rotation, y_rotation, z_rotation;
} transform_3D;
to fit the set of 3D points to the bunch of 3D lines. In detail I have got a set of coordinates of 3D points and equations of corresponding 3D lines, which should go through those points (in ideal situation). The LMA is minimizing the summ of distances of the transfomed 3D points to corresponding 3D lines. The transform function is as follows:
cv::Point3d Geometry::transformation_3D(cv::Point3d point, transform_3D transformation)
{
cv::Point3d p_odd,p_even;
//rotation x
p_odd.x=point.x;
p_odd.y=point.y*cos(transformation.x_rotation)-point.z*sin(transformation.x_rotation);
p_odd.z=point.y*sin(transformation.x_rotation)+point.z*cos(transformation.x_rotation);
//rotation y
p_even.x=p_odd.z*sin(transformation.y_rotation)+p_odd.x*cos(transformation.y_rotation);
p_even.y=p_odd.y;
p_even.z=p_odd.z*cos(transformation.y_rotation)-p_odd.x*sin(transformation.y_rotation);
//rotation z
p_odd.x=p_even.x*cos(transformation.z_rotation)-p_even.y*sin(transformation.z_rotation);
p_odd.y=p_even.x*sin(transformation.z_rotation)+p_even.y*cos(transformation.z_rotation);
p_odd.z=p_even.z;
//translation
p_even.x=p_odd.x+transformation.x_translation;
p_even.y=p_odd.y+transformation.y_translation;
p_even.z=p_odd.z+transformation.z_translation;
return p_even;
}
Hope this explanation will help a bit...
Edit2:
Some exemplary data is pasted below. 3D lines are described by the center point and the directional vector. Center point for all lines are (0,0,0) and 'uz' coordinate for each vector is equal to 1. Set of 'ux' coordinates of directional vectors:
-1.0986, -1.0986, -1.0986,
-1.0986, -1.0990, -1.0986,
-1.0986, -1.0986, -0.9995,
-0.9996, -0.9996, -0.9995,
-0.9995, -0.9995, -0.9996,
-0.9003, -0.9003, -0.9004,
-0.9003, -0.9003, -0.9003,
-0.9003, -0.9003, -0.8011,
-0.7020, -0.7019, -0.6028,
-0.5035, -0.5037, -0.4045,
-0.3052, -0.3053, -0.2062,
-0.1069, -0.1069, -0.1075,
-0.1070, -0.1070, -0.1069,
-0.1069, -0.1070, -0.0079,
-0.0079, -0.0079, -0.0078,
-0.0078, -0.0079, -0.0079,
0.0914, 0.0914, 0.0913,
0.0913, 0.0914, 0.0915,
0.0914, 0.0914
Set of 'uy' coordinates of directional vectors:
-0.2032, -0.0047, 0.1936,
0.3919, 0.5901, 0.7885,
0.9869, 1.1852, -0.1040,
0.0944, 0.2927, 0.4911,
0.6894, 0.8877, 1.0860,
-0.2032, -0.0047, 0.1936,
0.3919, 0.5902, 0.7885,
0.9869, 1.1852, 1.0860,
0.9869, 1.1852, 1.0861,
0.9开发者_如何学Python865, 1.1853, 1.0860,
0.9870, 1.1852, 1.0861,
-0.2032, -0.0047, 0.1937,
0.3919, 0.5902, 0.7885,
0.9869, 1.1852, -0.1039,
0.0944, 0.2927, 0.4911,
0.6894, 0.8877, 1.0860,
-0.2032, -0.0047, 0.1935,
0.3919, 0.5902, 0.7885,
0.9869, 1.1852
and set of 3D points in (x. y. z. x. y. z. x. y. z. ...) form:
{{0, 0, 0}, {0, 16, 0}, {0, 32, 0},
{0, 48, 0}, {0, 64, 0}, {0, 80, 0},
{0, 96, 0}, {0, 112,0}, {8, 8, 0},
{8, 24, 0}, {8, 40, 0}, {8, 56, 0},
{8, 72, 0}, {8, 88, 0}, {8, 104, 0},
{16, 0, 0}, {16, 16,0}, {16, 32, 0},
{16, 48, 0}, {16, 64, 0}, {16, 80, 0},
{16, 96, 0}, {16, 112, 0}, {24, 104, 0},
{32, 96, 0}, {32, 112, 0}, {40, 104, 0},
{48, 96, 0}, {48, 112, 0}, {56, 104, 0},
{64, 96, 0}, {64, 112, 0}, {72, 104, 0},
{80, 0, 0}, {80, 16, 0}, {80, 32, 0},
{80,48, 0}, {80, 64, 0}, {80, 80, 0},
{80, 96, 0}, {80, 112, 0}, {88, 8, 0},
{88, 24, 0}, {88, 40, 0}, {88, 56, 0},
{88, 72, 0}, {88, 88, 0}, {88, 104, 0},
{96, 0, 0}, {96, 16, 0}, {96, 32, 0},
{96, 48,0}, {96, 64, 0}, {96, 80, 0},
{96, 96, 0}, {96, 112, 0}}
This is kind of an "easy" modelled data with very small rotations.
Well, the proper way of using Levenberg-Marquardt is that you need a good initial estimate (a "seed") for your parameters. Recall that LM is a variant of Newton-Raphson; as with such iterative algorithms, the quality of your starting point will make or break your iteration; either converging to what you want, converging to something completely different (not that unlikely to happen, especially if you have a lot of parameters), or shooting off into the wild blue yonder (diverges).
In any event, it would be more helpful if you could mention the model function you're fitting, and possibly a scatter plot of your data; it might go a long way towards finding a workable solution for this.
I would suggest you try using a different approach to indirectly find your rotation parameters, namely to use a 4x4 affine transformation matrix to incorporate the translation and rotation parameters.
This gets rid of the nonlinearity of the sine and cosine functions (which you can figure out after the fact).
The tough part would be to constrain the transformation matrix from shearing or scaling, which you don't want.
Here you have your problem modeled and running with Mathematica.
I used the "Levenberg-Marquardt" method.
This is why I asked for your data. With MY data, YOUR problems are always going to be easier:)
xnew[x_, y_, z_] :=
RotationMatrix[rx, {1, 0, 0}].RotationMatrix[
ry, {0, 1, 0}].RotationMatrix[rz, {0, 0, 1}].{x, y, z} + {tx, ty, tz};
(* Generate Sample Data*)
(* Angles 1/2,1/3,1/5 *)
(* traslation -> {1,2,3} *)
(* Minimum mean Noise 5% *)
data = Table[{{x, y, z},
RotationMatrix[1/2, {1, 0, 0}].
RotationMatrix[1/3, {0, 1, 0}].
RotationMatrix[1/5, {0, 0, 1}].{x, y, z} +{1, 2, 3} +RandomReal[{-.05, .05}, 3]},
{x, 0, 1, .1}, {y, 0, 1, .1}, {z, 0, 1, .1}];
data = Flatten[data, 2];
(* Now find the parameters*)
FindMinimum[
Sum[SquaredEuclideanDistance[xnew[i[[1]] /. List -> Sequence],
i[[2]]], {i, data}]
, {rx, ry, rz, tx, ty, tz}, Method -> "LevenbergMarquardt"]
Out:
{3.2423, {rx -> 0.500566, ry -> 0.334012, rz -> 0.199902,
tx -> 0.99985, ty -> 1.99939, tz -> 3.00021}}
(Within 1/1000 of the real values)
Edit
I worked a little with your data.
The problem is that your system is very bad conditioned. You need much more data to effectively calculate such small rotations.
These are the results I got:
Rotations in degrees:
rx = 179.99999999999999999999984968493536659553226696793
ry = 180.00000000000000000000006934755799995159952661222
rz = 180.0006286861217378980724139120849587855611645627
Traslations
tx = 48.503663696727576867196234527227830090575281353092
ty = 63.974139455057300403798198525151849767949596684232
tz = -0.99999999999999999999997957276716543927459921348549
I should calculate the errors, but I've no time right now.
BTW, rz = Pi + 0.000011 (in radians)
HTH!
Well, I used ceres-solver to solve this, but I did make a modification in your data . Instead of "uz=1.0", I used "uz=0.0" which makes this entirely a 2d data fitting.
I got the following results. trans: -88.6384, -16.3879, 0 rot: 0, 0, -6.97813e-05
After getting these results, manually calculated the sum of orthogonal distance of transformed points to the corresponding lines and got 0.0280452.
struct CostFunctor {
CostFunctor(const double p[3], double ux, double uy){
p_[0] = p[0];p_[1] = p[1];p_[2] = p[2];
n_[0] = ux; n_[1] = uy;
n_[2] = 0.0;
normalize(n_);
}
template <typename T>
bool operator()(const T* const x, T* residual) const {
T pDash[3];
T pIn[3];
T temp[3];
pIn[0] = T(p_[0]);
pIn[1] = T(p_[1]);
pIn[2] = T(p_[2]);
//transform the input point p_ to pDash
xform(x, &pIn[0], &pDash[0]);
//find dot(pDash, n), where n is the direction of line
T pDashDotN = T(pDash[0]) * T(n_[0]) + T(pDash[1]) * T(n_[1]) + T(pDash[2]) * T(n_[2]);
//projection of pDash along line
temp[0] = pDashDotN * n_[0];temp[1] = pDashDotN * n_[1];temp[2] = pDashDotN * n_[2];
//orthogonal vector from projection to point
temp[0] = pDash[0] - temp[0];temp[1] = pDash[1] - temp[1];temp[2] = pDash[2] - temp[2];
//squared error
residual[0] = temp[0] * temp[0] + temp[1] * temp[1] + temp[2] * temp[2];
return true;
}
//untransformed point
double p_[3];
double ux_;
double uy_;
//direction of line
double n_[3];
};
template<typename T>
void xform(const T *x, const T * inPoint, T *outPoint3) {
T xTheta = x[3];
T pOdd[3], pEven[3];
pOdd[0] = inPoint[0];
pOdd[1] = inPoint[1] * cos(xTheta) + inPoint[2] * sin(xTheta);
pOdd[2] = -inPoint[1] * sin(xTheta) + inPoint[2] * cos(xTheta);
T yTheta = x[4];
pEven[0] = pOdd[0] * cos(yTheta) + pOdd[2] * sin(yTheta);
pEven[1] = pOdd[1];
pEven[2] = -pOdd[0] * sin(yTheta) + pOdd[2] * cos(yTheta);
T zTheta = x[5];
pOdd[0] = pEven[0] * cos(zTheta) - pEven[1] * sin(zTheta);
pOdd[1] = pEven[0] * sin(zTheta) + pEven[1] * cos(zTheta);
pOdd[2] = pEven[2];
T xTrans = x[0], yTrans = x[1], zTrans = x[2];
pOdd[0] += xTrans;
pOdd[1] += yTrans;
pOdd[2] += zTrans;
outPoint3[0] = pOdd[0];
outPoint3[1] = pOdd[1];
outPoint3[2] = pOdd[2];
}
精彩评论