HDFS WRITE
2019-12-24 10:43:29
lastCellSize * parityNum
Key code in hedgedFetchBlockByteRange
3~4M
0~128M block0 rep1
block9
block1
0~1M
dataSize
1~2M
4~5M
blk_last
767~768M
5~6M
... ...
5.writeLock and processIBR
Replica
Blk_Group
Block0
Write chunk
packet
data cell
...
1.heartBeat (per 3s)
stripe1
DataStreamers[data+parity]
NameNode
writeUnlock()
ack
Time elapsed since creation and last access
parity cell
Stripe
Until one packet is full, e.g. 126 chunks
EC
Dividing line
Each iteration picks 32% of the DNs; for every chosen DN, up to dfs.block.invalidate.limit = 1000 blocks are moved from BlockManager.invalidateBlocks to DatanodeDescriptor.invalidateBlocks
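The per-iteration scheduling described above can be sketched roughly as follows. This is a simplified model, not the BlockManager code: the class name, the two maps, and the in-order choice of DNs are assumptions made for illustration (the NameNode picks nodes differently and applies the percentage to live DNs rather than to DNs with pending work).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InvalidateWorkSketch {
    static final double WORK_PCT = 0.32;   // share of DNs handled per iteration (from the diagram)
    static final int BLOCK_LIMIT = 1000;   // dfs.block.invalidate.limit (from the diagram)

    /** Moves pending blocks from the NN-wide map into per-DN queues; returns how many were scheduled. */
    static int computeInvalidateWork(Map<String, Deque<Long>> pendingByDn,
                                     Map<String, List<Long>> perDnQueue) {
        int nodesToProcess = (int) Math.ceil(pendingByDn.size() * WORK_PCT);
        int scheduled = 0;
        Iterator<Map.Entry<String, Deque<Long>>> it = pendingByDn.entrySet().iterator();
        for (int i = 0; i < nodesToProcess && it.hasNext(); i++) {
            Map.Entry<String, Deque<Long>> entry = it.next();
            List<Long> out = perDnQueue.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            Deque<Long> pending = entry.getValue();
            // Hand at most BLOCK_LIMIT blocks to this DN in one iteration.
            for (int n = 0; n < BLOCK_LIMIT && !pending.isEmpty(); n++) {
                out.add(pending.poll());
                scheduled++;
            }
            if (pending.isEmpty()) {
                it.remove(); // nothing left to invalidate on this DN
            }
        }
        return scheduled;
    }

    /** Hypothetical scenario: 4 DNs, the first one holding 1500 pending blocks, the others 10 each. */
    static int demo() {
        Map<String, Deque<Long>> pending = new LinkedHashMap<>();
        long blockId = 0;
        for (int dn = 0; dn < 4; dn++) {
            Deque<Long> blocks = new ArrayDeque<>();
            int count = (dn == 0) ? 1500 : 10;
            for (int b = 0; b < count; b++) {
                blocks.add(blockId++);
            }
            pending.put("dn" + dn, blocks);
        }
        return computeInvalidateWork(pending, new LinkedHashMap<>());
    }

    public static void main(String[] args) {
        System.out.println("scheduled blocks: " + demo());
    }
}
```

With 4 DNs, ceil(4 * 0.32) = 2 nodes are processed in the iteration, and the first one is capped at 1000 blocks; the remaining 500 wait for a later round.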
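1. Obtain the DFSOutputStream output stream
2. Call the RPC, then obtain the output stream through the constructor and start the output stream's main thread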
stripe
6~7M
blk_0
block0
DatanodeDescriptor.invalidateBlocks
0~128M rep2
128~256M block1 rep2
cell-parity
128~256M block1 rep1
762~763M
Non-empty files under a leaf directory with size >= 6M are stored with EC
Thread/Monitor:BlockManager$RedundancyMonitor
blk_1
... ....
1. Remove the INode (directory or file) from the namespace and collect the invalid blocks into toRemovedBlocks
DN
stripe2
Data Cell
case2:first cell is not full
Hot data
2~3M
3.delete
dn[i]
dataQueue
blk_group_0
0~128M block0 rep3
DataNode1
3-replica storage
Allocate a block group blockGroup[data+parity], e.g. 9 blocks served by 9 DataStreamers
~
ResponseProcessor
2. DFSOutputStream.newStreamForCreate()
2.cmds
blk
Cold data
ResponseProcessors
dn[k]
writeLock()
0~128M block0 rep2
cellSize * parityNum * ( stripNum -1 )
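The two parity-size terms in this diagram, cellSize * parityNum * (stripNum - 1) for the full stripes and lastCellSize * parityNum for the last stripe, can be combined into one small worked calculation. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes that lastCellSize is the size of the first cell of the last stripe (equal to cellSize in case 1, smaller in case 2).

```java
public class ParitySizeSketch {
    /**
     * Total bytes of parity for a striped block group holding dataSize bytes of user data.
     * Parity cells in a stripe are as long as that stripe's first data cell.
     */
    static long paritySize(long dataSize, int cellSize, int dataNum, int parityNum) {
        if (dataSize == 0) {
            return 0;
        }
        long stripeSize = (long) cellSize * dataNum;
        long stripeNum = (dataSize + stripeSize - 1) / stripeSize;      // ceiling division
        long lastStripeData = dataSize - (stripeNum - 1) * stripeSize;  // bytes in the last stripe
        long lastCellSize = Math.min(cellSize, lastStripeData);         // case 1: full, case 2: partial
        return (long) cellSize * parityNum * (stripeNum - 1) + lastCellSize * parityNum;
    }

    public static void main(String[] args) {
        int cell = 1024 * 1024; // 1M cells, 6 data + 3 parity as in the diagram
        // Case 1: exactly one full stripe of 6M.
        System.out.println(paritySize(6L * cell, cell, 6, 3));
        // Case 2: one full stripe plus 512K in the first cell of the second stripe.
        System.out.println(paritySize(6L * cell + 512 * 1024, cell, 6, 3));
    }
}
```

For 6M of data the result is 3M of parity; adding 512K more data adds only 3 * 512K of parity, because the parity cells of the last stripe shrink to the size of its first data cell.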
period: dfs.namenode.redundancy.interval.seconds = 3
pipeline
4.IBR
NN
BlockManager.invalidateBlocks
blk_group_k
currentStreamer
blk_k
One packet
DataStreamer
On the first write, or when the current block is full, request a new block from the NN and create the pipeline
addToInvalidates -> BlockManager.invalidateBlocks
DFSOutputStream -> main thread
|- DataStreamer -> sends data
|-- ResponseProcessor -> handles the responses (acks)
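The interplay of the three threads with dataQueue and ackQueue can be modeled with a small simulation. This is a sketch under stated assumptions, not the HDFS client code: the class name, the `wire` queue standing in for the DataNode pipeline, and the END marker are all hypothetical, and packets are reduced to sequence numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class WritePipelineSketch {
    static final int END = -1; // hypothetical end-of-stream marker

    /** Pushes packetCount packets through the queues and returns the sequence numbers in ack order. */
    static List<Integer> run(int packetCount) {
        BlockingQueue<Integer> dataQueue = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> ackQueue = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> wire = new LinkedBlockingQueue<>(); // stands in for the DN pipeline
        List<Integer> acked = new ArrayList<>();

        // DataStreamer: takes a packet from dataQueue, parks it in ackQueue, then sends it.
        Thread streamer = new Thread(() -> {
            try {
                while (true) {
                    int seq = dataQueue.take();
                    if (seq != END) {
                        ackQueue.put(seq);
                    }
                    wire.put(seq);
                    if (seq == END) {
                        return;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // ResponseProcessor: on each ack, removes the matching packet from ackQueue.
        Thread responder = new Thread(() -> {
            try {
                while (true) {
                    int seq = wire.take();
                    if (seq == END) {
                        return;
                    }
                    int pending = ackQueue.take();
                    if (pending != seq) {
                        throw new IllegalStateException("ack out of order: " + seq);
                    }
                    acked.add(seq);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        streamer.start();
        responder.start();
        try {
            // Main thread: plays the role of DFSOutputStream filling dataQueue.
            for (int seq = 0; seq < packetCount; seq++) {
                dataQueue.put(seq);
            }
            dataQueue.put(END);
            streamer.join();
            responder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return acked;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

A packet stays in ackQueue until its ack arrives, which is what lets a client resend unacknowledged packets after a pipeline failure; the sketch leaves that recovery path out.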
Parity Cell
every dfs.namenode.block.deletion.increment = 1000
Similar to the DN's PacketProcessor
ackQueue
A cell holds at most 16 packets; once a cell is full, currentStreamer advances to the next streamer
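How currentStreamer moves across the data blocks follows from the cell layout shown in the diagram (0~1M, 1~2M, ... 5~6M in the first stripe, 6~7M starting the next). A minimal sketch of that mapping, with hypothetical class and method names and assuming 1M cells and 6 data blocks per block group:

```java
public class StripedCellSketch {
    /** Index of the data streamer (blk_0 .. blk_{dataNum-1}) that receives the byte at this offset. */
    static int streamerIndex(long offset, int cellSize, int dataNum) {
        return (int) ((offset / cellSize) % dataNum);
    }

    /** Zero-based index of the stripe that contains this offset. */
    static long stripeIndex(long offset, int cellSize, int dataNum) {
        return offset / ((long) cellSize * dataNum);
    }

    public static void main(String[] args) {
        int cell = 1024 * 1024;
        long[] offsets = {0L, 5L * cell + 1, 6L * cell, 767L * cell + 512 * 1024};
        for (long off : offsets) {
            System.out.println(off + " -> streamer " + streamerIndex(off, cell, 6)
                    + ", stripe " + stripeIndex(off, cell, 6));
        }
    }
}
```

Under these assumptions the 767~768M cell lands on the sixth data streamer of the last stripe, which matches a 768M block group made of six 128M data blocks.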
DataNode0
DFSInputStream#pread |DFSInputStream#hedgedFetchBlockByteRange
case1:first cell is full
FSNamesystem.delete()
128~256M block1 rep3
cell-data
DataNode2
Three thread classes
2. Add the blocks to be deleted to invalidateBlocks
1. DistributedFileSystem.create()
0~128M rep3
0~128M rep1
PIPELINE
7~8M
stripe
DFSClient.java
private static ThreadPoolExecutor HEDGED_READ_THREAD_POOL;
ThreadPoolExecutor getHedgedReadsThreadPool() {
  return HEDGED_READ_THREAD_POOL;
}
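The idea behind a hedged read on top of such a thread pool can be sketched as follows: start the read on one replica, and if it has not finished within a threshold, start a second read on another replica and take whichever result arrives first. This is a simplified illustration, not the body of hedgedFetchBlockByteRange; the class, the method signature, and the simulated replicas are assumptions, and failure handling and cancellation of the slower read are left out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class HedgedReadSketch {
    /** Returns the result of whichever read finishes first, hedging after thresholdMillis. */
    static String hedgedRead(ExecutorService pool, Callable<String> primary,
                             Callable<String> backup, long thresholdMillis) throws Exception {
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        completion.submit(primary);
        // Wait up to the threshold for the first replica.
        Future<String> first = completion.poll(thresholdMillis, TimeUnit.MILLISECONDS);
        if (first != null) {
            return first.get();
        }
        // The first replica is slow: start a second read and take the earliest completion.
        completion.submit(backup);
        return completion.take().get();
    }

    /** Hypothetical scenario: replica "dn0" is slow, replica "dn1" answers immediately. */
    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return hedgedRead(pool,
                    () -> { Thread.sleep(2000); return "dn0"; },
                    () -> "dn1",
                    50);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow(); // interrupts the still-sleeping slow read
        }
    }

    public static void main(String[] args) {
        System.out.println("served by " + demo());
    }
}
```

Sharing one static pool across clients, as the DFSClient fragment suggests, bounds the extra threads that hedging can create.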