I am getting this error with a mostly out-of-the-box configuration of version 0.20.203.0. Where should I look for a potential issue? Most of the configuration is out of the box. I was able to visit th
I would like to get information on all TaskAttempts of a Task of a Job on Hadoop. org.apache.hadoop.mapred.TaskReport gives information on running Attempts and successful Attempts, but I would like t
What is the best and fastest way to achieve a parallel copy to Hadoop from an NFS mount? We have a mount with a huge number of files and we need to copy them into HDFS.
After installing Mahout from (http://girlincomputerscience.blogspot.com/2010/11/apache-mahout.html), how do I run a Mahout algorithm, and where can I get the most popular and easy tutorial fo
Is this a bug or a setup issue in NewsKMeansClustering.java, an example given in chapter 9 of Mahout in Action?
I downloaded the Cloudera VM on my Windows 7 laptop to play around. I am trying to connect to the Hadoop instance running in the VM from
I need a variable that is shared between reduce tasks, and that each reduce task can read and write atomically.
I am using Cassandra with Hadoop for input and output. During the output reduce job, I got an error: 2011-08-10 03:54:04,326 WARN org.apache.hadoop.mapred.Child: Error running child
I want to integrate Hadoop with Pentaho Data Integration. On the Pentaho site I found Pentaho for Hadoop, but it's commercial. I want to make my data-integration communi
I am using Hadoop + Cassandra. I use setInputSplitSize(1000) to avoid overloading the mappers (and running out of heap memory), as the default is 64K. Altogether I have only 2M lines to