File blocks on HDFS
Does Hadoop guarantee that different blocks from the same file will be stored on different machines in the cluster? Obviously, replicated blocks will be on different machines.
No. If you look at the HDFS Architecture Guide, you'll see in the diagram that the file part-1 has a replication factor of 3 and is made up of three blocks, labelled 2, 4, and 5. Note how blocks 2 and 5 sit on the same DataNode in one case.
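You can check placement for yourself through the public FileSystem API. Here's a minimal sketch; the path /user/hadoop/part-1 is a placeholder, so point it at any file in your cluster:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocations {
        public static void main(String[] args) throws Exception {
            // Placeholder path; replace with a file in your cluster.
            Path file = new Path("/user/hadoop/part-1");

            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            FileStatus status = fs.getFileStatus(file);

            // One BlockLocation per block; getHosts() lists the DataNodes
            // holding replicas of that block.
            BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.printf("offset %d, length %d, hosts %s%n",
                    block.getOffset(), block.getLength(),
                    String.join(", ", block.getHosts()));
            }
            fs.close();
        }
    }

From the command line, hadoop fsck /user/hadoop/part-1 -files -blocks -locations prints the same placement information.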
Apparently not; see the Rebalancer section of the HDFS User Guide: http://hadoop.apache.org/common/docs/r0.20.2/hdfs_user_guide.html#Rebalancer
On the contrary, I think. Setting replication aside, each DataNode stores each block it holds as an ordinary file in its local file system, and nothing in that scheme ties different blocks of the same file to different machines.
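You can see those block files on any DataNode. Here's a minimal sketch that walks the data directory and prints them; the path /hadoop/dfs/data is an assumption, so check dfs.data.dir (dfs.datanode.data.dir in newer releases) in your hdfs-site.xml for the real one:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    public class ListBlockFiles {
        public static void main(String[] args) throws IOException {
            // Assumed location; check dfs.data.dir in hdfs-site.xml.
            Path dataDir = Paths.get("/hadoop/dfs/data");

            // Block replicas live on the DataNode's local disk as ordinary
            // files named blk_<id>, plus a .meta checksum file for each.
            try (Stream<Path> paths = Files.walk(dataDir)) {
                paths.filter(p -> p.getFileName().toString().startsWith("blk_"))
                     .forEach(System.out::println);
            }
        }
    }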
Well, Hadoop does not guarantee that, and that is a big loss of safety: if you request a file within a job and a DataNode holding one of its blocks is down, the complete job will fail just because a block is not available. I can't imagine a use case for your question; maybe you can tell us a bit more so we can understand what your intention really was.