Is this a bug or a setup issue when using NewsKMeansClustering.java?

Is this a bug or a setup problem in NewsKMeansClustering.java, the example code from chapter 9 of Mahout in Action? I ran the program against a directory of sequence files, and it failed with the following error:

Exception in thread "main" java.io.FileNotFoundException: File newsClusters/clustersclusteredPoints/part-m-00000 does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at mia.clustering.ch09.NewsKMeansClustering.main(NewsKMeansClustering.java:76)

The directory structure of my development environment is as follows:

~/workspaceMahout1/recommender/newsClusters% ls
canopy-centroids clusters df-count dictionary.file-0 frequency.file-0 tfidf-vectors tf-vectors tokenized-documents wordcount

~/workspaceMahout1/recommender/newsClusters/clusters/clusteredPoints% ls
part-m-00000

I then changed the original code from

new Path(clusterOutput + Cluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);

to

new Path(clusterOutput + "/clusteredPoints" + "/part-m-00000"), conf);

With this change the program runs without the error above. Is this a bug in the original code, or is there some other hidden issue?
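
The path in the error message, newsClusters/clustersclusteredPoints/part-m-00000, has no separator between "clusters" and "clusteredPoints", which suggests the concatenation itself drops the slash. Below is a minimal sketch that reproduces this without Hadoop. It assumes clusterOutput is "newsClusters/clusters" and that the constant's value is "clusteredPoints" (both inferred from the error message, not checked against the Mahout source). The class and method names are hypothetical, and java.nio.file.Paths stands in for Hadoop's Path.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathJoinSketch {
    // Assumed value inferred from the error message; stands in for Cluster.CLUSTERED_POINTS_DIR.
    static final String CLUSTERED_POINTS_DIR = "clusteredPoints";

    // Mirrors the original expression: plain string concatenation with no separator
    // between the output directory and the constant.
    static String joinByConcatenation(String clusterOutput) {
        return clusterOutput + CLUSTERED_POINTS_DIR + "/part-m-00000";
    }

    // Passes each component separately so the path API inserts the separators.
    static String joinByComponents(String clusterOutput) {
        Path p = Paths.get(clusterOutput, CLUSTERED_POINTS_DIR, "part-m-00000");
        // Normalize to forward slashes so the result looks the same on Windows.
        return p.toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        String clusterOutput = "newsClusters/clusters"; // assumed, see above
        System.out.println(joinByConcatenation(clusterOutput));
        System.out.println(joinByComponents(clusterOutput));
    }
}
```

If that assumption holds, the same effect might be had in the Hadoop code by building the path from components, for example with the Path(String parent, String child) constructor, instead of hard-coding "/clusteredPoints".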
