In which format does Hadoop persist data?
I have some experience with Lucene, and I'm trying to understand how data is actually stored on a slave server in the Hadoop framework.
Do we create an index on the slave server, with a set of attributes describing each document we store? How does it work in practice?
HDFS does not build a Lucene-style index over your documents; it stores files as raw bytes. Data is split into blocks of a certain size, and then replicated to other nodes in the cluster for reliability. This process is coordinated by a single "Name Node", which keeps track of which blocks of data have gone where.
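To make the block idea concrete, here is a toy sketch in Python. It is not Hadoop code: the node names, the 300 MB file size and the round-robin placement are assumptions for illustration only (real HDFS placement is rack-aware and decided by the Name Node). The 128 MB block size and replication factor of 3 match common HDFS defaults, but both are configurable.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, nodes, replication):
    """Toy round-robin placement: each block lands on `replication` distinct nodes.

    This only illustrates the bookkeeping; it is NOT the HDFS placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    block_map = {}
    for block_id in range(num_blocks):
        block_map[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return block_map


# Hypothetical cluster and file, chosen for the example.
nodes = ["node1", "node2", "node3", "node4"]
blocks = split_into_blocks(300 * MB, 128 * MB)
block_map = place_replicas(len(blocks), nodes, replication=3)

for block_id, (offset, length) in enumerate(blocks):
    print(block_id, offset, length, block_map[block_id])
```

The `block_map` dictionary plays the role of the Name Node's metadata here: it records where each block lives, while the block contents themselves would sit on the individual nodes.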
Hadoop provides you with a virtual filesystem, similar to Unix, which you can query using various Hadoop filesystem tools (ls, get, put, etc.).
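As a rough illustration of those tools, the sketch below builds `hadoop fs` command lines from Python. It assumes the `hadoop` executable is on your PATH, and the paths (`/user/example`, `local.txt`) are made up for the example. The helper names `hadoop_fs_command` and `run` are hypothetical, not part of any Hadoop API.

```python
import subprocess


def hadoop_fs_command(action, *paths):
    """Build the argv list for `hadoop fs -<action> <paths...>`."""
    allowed = {"ls", "put", "get"}
    if action not in allowed:
        raise ValueError(f"unsupported action: {action}")
    return ["hadoop", "fs", f"-{action}", *paths]


def run(argv):
    """Run a command and return its standard output (needs a working Hadoop install)."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical paths, for illustration only.
ls_cmd = hadoop_fs_command("ls", "/user/example")
put_cmd = hadoop_fs_command("put", "local.txt", "/user/example/local.txt")
get_cmd = hadoop_fs_command("get", "/user/example/local.txt", "copy.txt")

print(" ".join(ls_cmd))
print(" ".join(put_cmd))
print(" ".join(get_cmd))
```

On a machine with Hadoop installed, calling `run(ls_cmd)` would list the directory much as `ls` does on a local Unix filesystem.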
This link should give you a comprehensive overview.