Does Cloudera Mountable HDFS provide deduplication?

I'm looking at running an HDFS-based storage cluster, and at using the Mountable HDFS system from the Cloudera release as a simple way to access it.

My first question: will this provide automatic deduplication of data?

My second question: if deduplication is done, and all users delete the files that contain a given deduplicated block, is the block then actually deleted from storage, or just the index/reference for that user?

Lastly, would this method include the RainStor compression methods?

Thanks for your input


No, HDFS does not include data deduplication.

The architecture is mainly focused on making optimal use of sequential write/read patterns, so it works against deduplication: every deduplication approach I am aware of introduces a certain amount of random I/O.
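To make the random I/O point concrete, here is a minimal sketch of what block-level deduplication generally involves. This is not something HDFS or the Cloudera release does; the `DedupStore` class, its method names, and the tiny block size are all made up for illustration. Every block written needs a lookup in a hash index, and a block is only physically removed once no file references it any more.

```python
import hashlib


class DedupStore:
    """Toy content-addressed block store (hypothetical, not an HDFS API)."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}    # digest -> block bytes (the physical storage)
        self.refcount = {}  # digest -> number of references held by files
        self.files = {}     # file name -> ordered list of block digests

    def put(self, name, data):
        # Overwriting a file first releases the blocks it used to reference.
        if name in self.files:
            self.delete(name)
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # One index lookup per block: on a real disk-backed index this
            # is the random-access step that breaks a purely sequential write.
            if digest not in self.blocks:
                self.blocks[digest] = block
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        # Reassemble the file from its (possibly shared) blocks.
        return b"".join(self.blocks[d] for d in self.files[name])

    def delete(self, name):
        # Drop this file's references; free a block only when the last
        # reference to it is gone.
        for digest in self.files.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                del self.refcount[digest]
                del self.blocks[digest]
```

In a scheme like this, deleting one user's file only removes that file's references; the shared block stays in storage until the last file pointing at it is deleted. Whether a real deduplicating system frees the block immediately or leaves it to a later garbage collection pass depends on the product.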
