
hadoop benchmark - terasort [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 11 years ago.

I built my own 4-node Hadoop cluster (1 namenode + 3 datanodes).

Now I am trying to test its performance.

This took 71 seconds:

hadoop jar $HADOOP_INSTALL/hadoop-examples.jar randomwriter random-data -Dtest.randomwrite.bytes_per_map=5000000 -Dtest.randomwrite.total_bytes=50000000

This took 218 seconds:

hadoop jar $HADOOP_INSTALL/hadoop-examples.jar sort random-data sorted-data

This took 368 seconds:

hadoop jar $HADOOP_INSTALL/hadoop-test.jar testmapredsort -sortInput random-data -sortOutput sorted-data

How can I tell whether my cluster is configured well? How long should these jobs take on my cluster? Each node has:

4x Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (6 cores each)

24 GB RAM

Thanks.


I did a quick run with your parameters on my cluster (1 namenode + 2 datanodes running Hadoop 0.21.0). The three jobs took 27 seconds, 23 seconds, and 26 seconds, respectively.

Tested with 4x Intel(R) Xeon(R) CPU E5607 @ 2.27GHz (4 cores each), 31 GB RAM.

I left the Hadoop config as is, but turned off speculative execution by setting mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution to false.
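For a single run, the two properties can probably be passed on the command line instead of editing the config files. The following is only a sketch: it assumes the sort example accepts generic -D options (placed before the input and output arguments), and the helper name sort_without_speculation, the DRY_RUN switch, and the fallback install path are made up for illustration.

```shell
#!/bin/sh
# Sketch: rerun the sort step with speculative execution turned off.
# Assumption: generic -D options are accepted and must come before the
# positional input/output arguments.

HADOOP_INSTALL="${HADOOP_INSTALL:-/usr/lib/hadoop}"  # hypothetical fallback path

# Hypothetical helper: $1 = input dir, $2 = output dir.
sort_without_speculation() {
  # Rebuild the positional parameters as the full hadoop command line.
  set -- hadoop jar "$HADOOP_INSTALL/hadoop-examples.jar" sort \
    -Dmapred.map.tasks.speculative.execution=false \
    -Dmapred.reduce.tasks.speculative.execution=false \
    "$1" "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"      # only print the command
  else
    time "$@"      # run it and report wall-clock time
  fi
}
```

Calling it with DRY_RUN=1 prints the command without touching the cluster, which makes it easy to check the flags before a real run, for example: DRY_RUN=1 sort_without_speculation random-data sorted-data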

You can also experiment with the block size (dfs.block.size, given in bytes), preferably larger than the default, and see if that speeds things up. Note that the default is 64 MB on Hadoop 0.2x/1.x and 128 MB on 2.x.
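Since the block size applies when files are written, one way to try this is to regenerate the input with a different dfs.block.size. This is a rough sketch rather than a tested recipe: it assumes randomwriter honours generic -D options placed before the output directory, and the helper names, the DRY_RUN switch, the fallback install path, and the 256 MB figure are chosen purely for illustration.

```shell
#!/bin/sh
# Sketch: regenerate the benchmark input with a larger HDFS block size.
# Assumption: dfs.block.size is given in bytes and is picked up from -D.

HADOOP_INSTALL="${HADOOP_INSTALL:-/usr/lib/hadoop}"  # hypothetical fallback path

# Convert a size in MB to bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: $1 = block size in MB, $2 = output dir.
randomwriter_with_block_size() {
  set -- hadoop jar "$HADOOP_INSTALL/hadoop-examples.jar" randomwriter \
    -Ddfs.block.size="$(mb_to_bytes "$1")" \
    "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"      # only print the command
  else
    time "$@"      # run it and report wall-clock time
  fi
}
```

For instance, DRY_RUN=1 randomwriter_with_block_size 256 random-data-256m prints the command that would write the data with 256 MB blocks. Rerunning the sort on that directory and comparing timings shows whether the change helps on a given cluster.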

More on hadoop benchmarking: http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
