Java MDSJ produces NaN
Does anyone have experience with MDSJ? The following input produces only NaN results, and I can't figure out why. The documentation is pretty sparse.
import mdsj.Data;
import mdsj.MDSJ;
public class MDSJDemo {
public static void main(String[] args) {
double[][] input = {
{78.0, 60.0, 30.0, 25.0, 24.0, 7.125, 1600.0, 1.4953271028037383, 15.0, 60.0, 0.0, 0.0, 50.0},
{63.1578947368421, 51.81818181818182, 33.0, 30.0, 10.714285714285715, 6.402877697841727, 794.2857142857143, 0.823045267489712, 15.0, 20.0, 2.8571428571428568, 0.0, 75.0},
{55.714285714285715, 70.0, 16.363636363636363, 27.5, 6.666666666666666, 5.742574257425742, 577.1428571428571, 0.6542056074766355, 12.857142857142856, 10.0, 17.142857142857142, 0.0, 25.0}
};
int n=input[0].length; // number of data objects
double[][] output=MDSJ.classicalScaling(input); // apply MDS
System.out.println(Data.format(output));
for(int i=0; i<n; i++) { // output all coordinates
System.out.println(output[0][i]+" "+output[1][i]);
}
}
}
This is the output:
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
Perhaps I'm using MDS incorrectly. Each subarray of length 13 in input
is intended to represent one object, yet MDSJ is returning 13 points.
It also fails for this input:
double[][] input = {
{3, 4, 3},
{5, 6, 1},
{0, 1, 2}
};
EDIT: It appears that I have been using it wrong. I had been giving it an input like this:
Object A: {30d, 1d, 0d, 4.32, 234.1}
Object B: {45d, 3.21, 45, 91.2, 9.9}
Object C: {7.7, 93.1, 401, 0d, 0d}
But what it actually wants is a distance matrix like this:
A B C
A 0 3 1
B 3 0 5
C 1 5 0
Not exactly, though, because for this input:
double[][] input = {
{0, 3, 1},
{3, 0, 5},
{1, 5, 0}
};
I get this result:
0.8713351726043931 -2.361724203891451 2.645016918006963
NaN NaN NaN
0.8713351726043931 NaN
-2.361724203891451 NaN
2.645016918006963 NaN
But if it does want an array of distances, what is the point of using MDS in the first place? I thought it was supposed to boil an array of attributes down into coordinates.
Multidimensional scaling (MDS) turns distances into coordinates. If you already have coordinates in a high-dimensional space and want to embed them optimally in a low-dimensional space, principal component analysis (PCA) is probably the technique you are looking for.
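If you do want to feed attribute vectors to MDS anyway, you first need to turn them into a distance matrix. Here is a minimal sketch, assuming plain Euclidean distance is a sensible dissimilarity for your attributes (it may not be if they are on very different scales). The class and method names are made up for illustration and are not part of MDSJ.

```java
import java.util.Arrays;

// Hypothetical helper, not part of MDSJ: builds a symmetric distance matrix
// from objects given as rows of attributes.
class PairwiseDistances {

    // Returns the n x n matrix of Euclidean distances between the rows.
    static double[][] euclidean(double[][] rows) {
        int n = rows.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < rows[i].length; k++) {
                    double diff = rows[i][k] - rows[j][k];
                    sum += diff * diff;
                }
                // Distances are symmetric; the diagonal stays 0.
                d[i][j] = d[j][i] = Math.sqrt(sum);
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // The three example objects from the question, one object per row.
        double[][] objects = {
            {30.0, 1.0, 0.0, 4.32, 234.1},
            {45.0, 3.21, 45.0, 91.2, 9.9},
            {7.7, 93.1, 401.0, 0.0, 0.0}
        };
        double[][] distances = euclidean(objects);
        for (double[] row : distances) {
            System.out.println(Arrays.toString(row));
        }
        // This 3x3 matrix is the kind of input that
        // MDSJ.classicalScaling(...) from the question expects.
    }
}
```

Distances computed this way come from real coordinates, so they satisfy the triangle inequality by construction.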
Classical MDS and PCA are closely related. First, MDS converts the input distances into preliminary high-dimensional coordinates (with up to as many dimensions as there are objects); second, it reduces the dimensionality of these coordinates in a PCA-like step that discards the least important axes.
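The first of these steps is usually done by "double centering" the squared distances, which yields a matrix of inner products between centered coordinates. The sketch below shows only that step, under the textbook formulation of classical MDS; I am not claiming this is how MDSJ implements it internally, and the class and method names are invented. The eigendecomposition that follows is omitted.

```java
import java.util.Arrays;

// Illustrative sketch of the double-centering step of classical MDS.
// Assumes a symmetric distance matrix with a zero diagonal.
class DoubleCentering {

    // Computes B = -1/2 * J * D^2 * J, where D^2 holds squared distances
    // and J = I - (1/n) * (matrix of ones).
    static double[][] gramFromDistances(double[][] d) {
        int n = d.length;
        double[][] sq = new double[n][n];
        double[] rowMean = new double[n];
        double grandMean = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                sq[i][j] = d[i][j] * d[i][j];
                rowMean[i] += sq[i][j] / n;
            }
            grandMean += rowMean[i] / n;
        }
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                // By symmetry the column means equal the row means.
                b[i][j] = -0.5 * (sq[i][j] - rowMean[i] - rowMean[j] + grandMean);
            }
        }
        return b;
    }

    public static void main(String[] args) {
        // Distances between three points on a line at positions 0, 3 and 4.
        double[][] d = {
            {0, 3, 4},
            {3, 0, 1},
            {4, 1, 0}
        };
        for (double[] row : gramFromDistances(d)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For distances that really come from points in a Euclidean space, this matrix has no negative eigenvalues, and the coordinates are read off from its eigenvectors scaled by the square roots of the eigenvalues.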
The point of using MDS is that in some settings the input distances are not derived from existing coordinates but from something non-geometric, such as dissimilarity ratings made by people.
Your 3x3 dissimilarity matrix does not obey the triangle inequality required in metric spaces (d[1][0]+d[0][2]<d[1][2], i.e. 3+1<5) and therefore cannot be embedded exactly in a Euclidean space. Technically, the NaN values in the second dimension are due to the negative second eigenvalue of the modified dissimilarity matrix: the coordinates along each axis are scaled by the square root of the corresponding eigenvalue, and the square root of a negative number is NaN.
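A quick sanity check before running MDS is to test the triangle inequality on your matrix yourself. This is a small sketch with a hypothetical helper name (nothing here comes from MDSJ); the tolerance value is an arbitrary choice to absorb floating-point rounding.

```java
// Hypothetical helper: checks d[i][j] <= d[i][k] + d[k][j] for all i, j, k.
class TriangleCheck {

    static boolean satisfiesTriangleInequality(double[][] d) {
        final double tolerance = 1e-12; // arbitrary slack for rounding errors
        int n = d.length;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    if (d[i][j] > d[i][k] + d[k][j] + tolerance) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The 3x3 matrix from the question: 5 > 3 + 1, so the check fails.
        double[][] input = {
            {0, 3, 1},
            {3, 0, 5},
            {1, 5, 0}
        };
        System.out.println(satisfiesTriangleInequality(input)); // prints false
    }
}
```

Note that passing this check is necessary but not sufficient: a matrix can satisfy the triangle inequality and still not be exactly embeddable in a Euclidean space.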