Hadoop Distributed File System
NameNode: only one
DataNode: many
NameNode
- holds the metadata: file names, directories, file properties (timestamps, replication factor, permissions), and which DataNodes hold each block
- receives user requests
- maintains the file system directory structure
- manages the mapping from files to blocks and from blocks to DataNodes
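
The two mappings kept by the NameNode can be pictured as a pair of dictionaries. This is only a minimal sketch: `ToyNameNode`, `add_file`, and `locate` are hypothetical names made up for illustration, not part of any Hadoop API, and the block IDs and DataNode names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical in-memory model of NameNode metadata (not a Hadoop API)."""
    # file path -> ordered list of block IDs
    file_to_blocks: dict = field(default_factory=dict)
    # block ID -> list of DataNode names holding a replica
    block_to_datanodes: dict = field(default_factory=dict)

    def add_file(self, path, placements):
        """Register a file; placements is a list of (block_id, [datanodes])."""
        self.file_to_blocks[path] = [block_id for block_id, _ in placements]
        for block_id, datanodes in placements:
            self.block_to_datanodes[block_id] = list(datanodes)

    def locate(self, path):
        """Answer a client request: which DataNodes hold each block of path?"""
        return [(b, self.block_to_datanodes[b]) for b in self.file_to_blocks[path]]


nn = ToyNameNode()
nn.add_file("/logs/app.log", [("blk_1", ["dn1", "dn2", "dn3"]),
                              ("blk_2", ["dn2", "dn3", "dn4"])])
print(nn.locate("/logs/app.log"))
```

Note that the model stores only metadata; the file bytes themselves never pass through it, which matches the split of roles between NameNode and DataNode described here.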
DataNode
- stores the file data
- files are split into blocks (default 128 MB), which are stored on the hard drive
- also keeps a checksum of the data
- each block is replicated to multiple DataNodes so that data survives a node failure
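
To make the block, checksum, and replica ideas concrete, here is a rough sketch. It assumes a replication factor of 3, uses `zlib.crc32` as a stand-in for whatever checksum HDFS actually computes, and places replicas round-robin, which is a simplification rather than the real HDFS placement policy. All function names are hypothetical.

```python
import zlib

BLOCK_SIZE = 128 * 1024 * 1024  # default block size from the notes (128 MB)


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block for a file of file_size bytes."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, datanodes, replication=3):
    """Assumed round-robin placement: each block gets `replication` distinct nodes."""
    return [[datanodes[(i + r) % len(datanodes)] for r in range(replication)]
            for i in range(num_blocks)]


def checksum(data):
    """Stand-in checksum (CRC-32) stored next to the data to detect corruption."""
    return zlib.crc32(data)


# A 300 MB file becomes two full blocks plus one 44 MB block.
sizes = split_into_blocks(300 * 1024 * 1024)
placements = place_replicas(len(sizes), ["dn1", "dn2", "dn3", "dn4"])

# On read, recompute the checksum and compare it with the stored one.
stored = checksum(b"hello")
is_intact = checksum(b"hello") == stored
is_corrupt = checksum(b"hellp") != stored
```

The last block is smaller than 128 MB because a block only occupies as much space as the data it holds.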
Secondary NameNode
- background helper process that monitors HDFS status
- periodically takes a snapshot (checkpoint) of the HDFS metadata
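
The periodic snapshot can be thought of as folding a log of recent changes into the last saved image of the namespace. The sketch below assumes that model; the `("create" | "delete", path, value)` edit format and the `apply_edits` helper are invented for illustration and do not reflect Hadoop's on-disk formats.

```python
def apply_edits(image, edits):
    """Return a new namespace image with the logged edits applied in order."""
    merged = dict(image)  # copy so the previous snapshot stays untouched
    for op, path, value in edits:
        if op == "create":
            merged[path] = value
        elif op == "delete":
            merged.pop(path, None)
    return merged


# Last saved snapshot of the namespace, plus the changes made since then.
image = {"/a.txt": {"replication": 3}}
edits = [("create", "/b.txt", {"replication": 2}),
         ("delete", "/a.txt", None)]

checkpoint = apply_edits(image, edits)
print(checkpoint)
```

After a checkpoint the edit log can start fresh, which keeps it from growing without bound.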
MapReduce
JobTracker: only one
- receives job requests from users
- allocates the job's tasks to TaskTrackers
- monitors the running status of the TaskTrackers
TaskTracker: many
- runs the tasks assigned by the JobTracker
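
The division of labour between the JobTracker and the TaskTrackers can be sketched as a small scheduler. This is a toy model under stated assumptions: tasks are handed out round-robin, the state names `RUNNING` and `SUCCEEDED` are made up, and `ToyJobTracker` is a hypothetical class, not the Hadoop one.

```python
from collections import defaultdict
from itertools import cycle


class ToyJobTracker:
    """Hypothetical single coordinator that assigns tasks and tracks status."""

    def __init__(self, trackers):
        self.trackers = list(trackers)
        self.assignments = defaultdict(list)  # TaskTracker name -> task names
        self.status = {}                      # task name -> state string

    def submit(self, job_name, num_tasks):
        """Receive a job and allocate its tasks to TaskTrackers round-robin."""
        round_robin = cycle(self.trackers)
        for i in range(num_tasks):
            task = f"{job_name}-task{i}"
            self.assignments[next(round_robin)].append(task)
            self.status[task] = "RUNNING"

    def report(self, task, state):
        """Called on behalf of a TaskTracker to update the state of a task."""
        self.status[task] = state

    def is_done(self, job_name):
        """A job is done once every one of its tasks has succeeded."""
        prefix = job_name + "-"
        return all(state == "SUCCEEDED"
                   for task, state in self.status.items()
                   if task.startswith(prefix))


jt = ToyJobTracker(["tt1", "tt2"])
jt.submit("wordcount", 3)
done_before = jt.is_done("wordcount")
for task in list(jt.status):
    jt.report(task, "SUCCEEDED")
done_after = jt.is_done("wordcount")
```

Because there is only one JobTracker, all scheduling and monitoring state lives in a single place, just as the file system metadata lives in the single NameNode.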