hadoop benchmark - terasort [closed]
I built my own 4-node (namenode + 3x datanodes) Hadoop cluster.
Now I am trying to test its performance.

This took 71 seconds:
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar randomwriter random-data -Dtest.randomwrite.bytes_per_map=5000000 -Dtest.randomwrite.total_bytes=50000000

This took 218 seconds:
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar sort random-data sorted-data

This took 368 seconds:
hadoop jar $HADOOP_INSTALL/hadoop-test.jar testmapredsort -sortInput random-data -sortOutput sorted-data

How can I tell whether my cluster is configured well? How long should these jobs take on my node configuration:
4x Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (6 cores each), 24 GB RAM

Thanks.
I did a quick run with your params on my cluster (1 namenode + 2 datanodes running Hadoop-0.21.0). The three jobs took 27 seconds, 23 seconds, and 26 seconds respectively.
Tested with 4x Intel(R) Xeon(R) CPU E5607 @ 2.27GHz (4 cores each), 31 GB RAM.
I left the Hadoop config as is, but turned off speculative execution: mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution -> false.
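If you don't want to edit mapred-site.xml, you can also pass those properties per job. This is just a sketch, assuming the sort example accepts generic -D options (it runs through ToolRunner, so it should):

# run the sort job with speculative execution disabled for this run only
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar sort \
    -Dmapred.map.tasks.speculative.execution=false \
    -Dmapred.reduce.tasks.speculative.execution=false \
    random-data sorted-data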
You can also play around with different settings of the block size (dfs.block.size), preferably bigger than the default 128 MB, and see if that speeds things up.
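Note that dfs.block.size takes a value in bytes and only applies when a file is written, so it has to be in effect (in hdfs-site.xml or per job) before you generate the input data. A minimal sketch, assuming you want 256 MB blocks:

# regenerate the input data with 256 MB blocks (268435456 bytes)
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar randomwriter \
    -Ddfs.block.size=268435456 \
    -Dtest.randomwrite.bytes_per_map=5000000 \
    -Dtest.randomwrite.total_bytes=50000000 \
    random-data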
More on hadoop benchmarking: http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
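Since your title mentions terasort: that link walks through the standard TeraGen/TeraSort/TeraValidate sequence. Roughly, and assuming your examples jar includes the terasort programs (the output paths here are just placeholders), it looks like this:

# generate 10 million 100-byte rows (~1 GB) of input
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar teragen 10000000 terasort-input
# sort the generated data
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar terasort terasort-input terasort-output
# check that the output is globally sorted
hadoop jar $HADOOP_INSTALL/hadoop-examples.jar teravalidate terasort-output terasort-report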