Hadoop - Developer


Mahout/Hadoop: SQL to SequenceFile
I am starting to use Mahout for clustering, but I am having a hard time converting a SQL (MySQL) dump to a Mahout-compatible SequenceFile. I am using the code above.
Views: 2
How to get names of the currently running hadoop jobs?
I need to get the list of job names that are currently running, but hadoop job -list gives me a list of job IDs.
Views: 3
Emitting a Matrix from a mapper in Hadoop
I am new to Hadoop MapReduce. I wanted to know whether there is an OutputFormat type that allows me to emit a matrix (2D array) directly from the mapper (without converting it to 1D)
Views: 4
Why is the elephantbird Pig JsonLoader only processing part of my file?
I'm using Pig on Amazon's Elastic Map-Reduce to do batch analytics. My input files are on S3 and contain events that are represented by one JSON dictionary per line. I use the elephantbird JsonLoader
Views: 4
Processing paragraphs in text files as single records with Hadoop
Simplifying my problem a bit, I have a set of text files with "records" that are delimited by double newline characters. Like
Views: 2
Hadoop not running in the multinode cluster
I have a jar file "Tsp.jar" that I made myself. This same jar file executes well in a single-node cluster setup of Hadoop. However, when I run it on a cluster comprising 2 machines, a laptop and deskto
Views: 4
MapReduce shuffle/sort method
Somewhat of an odd question, but does anyone know what kind of sort MapReduce uses in the sort portion of shuffle/sort? I would think merge or insertion (in keeping with the whole MapReduce par
Views: 9
Exception while executing hadoop job remotely
I am trying to execute a Hadoop job on a remote hadoop cluster. Below is my code. Configuration conf = new Configuration();
Views: 8
Interpreting output from mahout clusterdumper
I ran a clustering test on crawled pages (more than 25K docs; personal data set). I've done a clusterdump:
Views: 3
Want to compare two consecutive jobs on Hadoop
I want to know if I can compare two consecutive jobs in Hadoop. If not, I would appreciate it if anyone can tell me how to proceed with that. To be precise, I want to compare the jobs in terms of what exa
Views: 2
