All the Apache Hadoop code is hosted in SVN. How does Git help in the Hadoop development process? It's not clear from the article below.
I am a newbie to Nutch and Hadoop, and I am trying to follow the tutorial at http://wiki.apache.org/nutch/NutchHadoopTutorial.
So, I've seen a couple of tutorials for this online, but each seems to say to do something different. Also, none of them seems to specify whether you're trying to get things working on a remote machine or locally.
The Hadoop documentation states: The right number of reduces seems to be 0.95 or 1.75 multiplied by (number of nodes * mapred.tasktracker.reduce.tasks.maximum).
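As a sanity check, the rule of thumb above can be computed directly. This is only a sketch; the node count and slot count below are made-up example values, not figures from the question:

```java
// Sketch of the reducer-count rule of thumb from the Hadoop docs:
//   reduces ≈ factor * (number of nodes * mapred.tasktracker.reduce.tasks.maximum)
// where factor is 0.95 (all reduces can launch immediately) or
// 1.75 (faster nodes finish a first wave and start a second).
public class ReduceCount {
    static int reduces(double factor, int nodes, int maxReduceSlotsPerNode) {
        return (int) (factor * nodes * maxReduceSlotsPerNode);
    }

    public static void main(String[] args) {
        int nodes = 10; // hypothetical cluster size
        int slots = 2;  // hypothetical mapred.tasktracker.reduce.tasks.maximum
        System.out.println(reduces(0.95, nodes, slots)); // 19
        System.out.println(reduces(1.75, nodes, slots)); // 35
    }
}
```

With 0.95, every reduce can start as soon as the maps finish; with 1.75, faster nodes run a second wave of reduces, which improves load balancing at the cost of more framework overhead.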
It seems like a very common use case, but it is surprisingly hard to do in Hadoop (though it is possible with the WholeFileRecordReader class).
I am new to Hadoop and HDFS, so maybe it is something I am doing wrong when I copy from local (Ubuntu 10.04) to HDFS on a single node on localhost. The initial copy works fine, but I run into problems when I modify my local file and copy it again.
When should I use FileOutputFormat.setCompressOutput(conf, true), and when should I not? I heard that it compresses mapper output. Is there any way to compress the reducer-side output?
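For what it's worth, compression can also be toggled through configuration properties rather than code. A minimal sketch using the old (pre-0.21) MapReduce API property names in mapred-site.xml:

```xml
<!-- Compress the intermediate map output that is shuffled to reducers. -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<!-- Compress the final job output written by the reducers;
     this is what FileOutputFormat.setCompressOutput(conf, true) controls. -->
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>
```

Note that setCompressOutput affects the final (reducer-side) job output, not the intermediate map output; the two are configured independently.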
Hi, I am trying to run Apache Nutch 1.2 on Amazon's EMR. To do this I specify an input directory from S3. I get the following error:
I am using Spring + DataNucleus JDO + HBase. HBase is running in fully distributed mode with two nodes. I am facing serious performance issues here.