Using encryption with Hadoop

The Cloudera documentation says that Hadoop does not support on-disk encryption. Would it be possible to use hardware-encrypted hard drives with Hadoop?


eCryptfs can be used to do per-file encryption on each individual Hadoop node. It's rather tedious to set up, but it certainly can be done.

Gazzang offers a turnkey commercial solution built on top of eCryptfs to secure "big data" through encryption, and partners with several of the Hadoop and NoSQL vendors.

Gazzang's cloud-based Encryption Platform for Big Data helps organizations transparently encrypt data stored in the cloud or on premises, using advanced key management and process-based access control lists, and helping meet security and compliance requirements.

Full disclosure: I am one of the authors and current maintainers of eCryptfs. I am also Gazzang's Chief Architect and a lead developer.


If you have mounted a file system on the drive then Hadoop can use the drive. HDFS stores its data in the normal OS file system. Hadoop will not know whether the drive is encrypted or not and it will not care.
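To see which local file systems a DataNode actually writes to, one option is to read the data directories from `hdfs-site.xml` and match them against the node's mount table. The sketch below is only an illustration under some assumptions: the property name `dfs.datanode.data.dir` (older releases used `dfs.data.dir`), a Linux `/proc/mounts`-style table, and the helper names `datanode_dirs` and `backing_mount` are all hypothetical choices, not something from the answers here. It also cannot detect self-encrypting drives, since those look like ordinary block devices to the OS.

```python
import xml.etree.ElementTree as ET


def datanode_dirs(hdfs_site_xml, prop="dfs.datanode.data.dir"):
    """Return the local directories listed under `prop` in hdfs-site.xml text."""
    root = ET.fromstring(hdfs_site_xml)
    for p in root.iter("property"):
        if p.findtext("name", "").strip() == prop:
            dirs = []
            # The value is a comma-separated list, optionally with file:// URIs.
            for item in p.findtext("value", "").split(","):
                item = item.strip()
                if item.startswith("file://"):
                    item = item[len("file://"):]
                if item:
                    dirs.append(item)
            return dirs
    return []


def backing_mount(path, mounts_text):
    """Return (device, mount_point, fs_type) for the longest mount point
    that contains `path`, or None if nothing matches."""
    best = None
    probe = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        if probe.startswith(prefix):
            if best is None or len(mount_point) > len(best[1]):
                best = (device, mount_point, fs_type)
    return best


if __name__ == "__main__":
    # Paths below are the usual defaults on many installs; adjust as needed.
    with open("/etc/hadoop/conf/hdfs-site.xml") as f:
        dirs = datanode_dirs(f.read())
    with open("/proc/mounts") as f:
        mounts = f.read()
    for d in dirs:
        print(d, backing_mount(d, mounts))
```

A file system type of `ecryptfs`, or a device under `/dev/mapper/`, hints that software encryption sits below the directory. Either way, Hadoop itself only sees ordinary files.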


Hadoop doesn't directly support encryption, though a compression codec can be used for encryption/decryption. Here are more details about encryption and HDFS.

Regarding hardware-based encryption, I think Hadoop should be able to work with it. As Spike mentioned, HDFS is like any other Java application and stores its data in the normal OS file system. FYI, MapR uses Direct I/O for better HDFS performance.


See also Intel's Rhino. Not open source yet...

https://github.com/intel-hadoop/project-rhino/
https://hadoop.intel.com/pdfs/IntelEncryptionforHadoopSolutionBrief.pdf
