YoloV3
2021-03-23 17:29:26
YoloV3 structure
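Main logic: 1. Feature extraction and feature integration (feature extraction and dimensionality reduction within a single feature layer). 2. Feature fusion (from the large feature layer to the small feature layer).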
model.DarkNet
y
* stride_h
WHC transpose to CHW
sigmoid
torch.cat: bs * 3hw * (x+y+w+h+conf+class_conf_list)
view: bs * 3 * h * w * 1 -> bs * 3hw * 1
feature_26: torch.Tensor batch_size(1) * channels(512) * height(26) * width(26)
raw_image: PIL.Image.Image, scale_width: 416, scale_height: 416
+ grid_x
* anchor_w
predict_13: torch.Tensor batch_size(1) * 3(x+y+w+h+conf+class_conf_list) * height(13) * width(13)
w
Add the batch_size dimension: [image]
scaled_image: PIL.Image.Image width(416) * height(416) * RGB
yolo_utils.letterbox_image
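The diagram only names yolo_utils.letterbox_image and its 416 x 416 target, so the following is a minimal sketch of the usual letterbox geometry, assuming the function resizes with the aspect ratio preserved and centers the result on a padded canvas. The helper name letterbox_geometry is hypothetical and is not taken from yolo_utils.

```python
def letterbox_geometry(src_w, src_h, dst_w=416, dst_h=416):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving resize.

    Assumption: the resized image is centered on a dst_w x dst_h canvas.
    Integer arithmetic is used so the result does not depend on float rounding.
    """
    if src_w * dst_h >= src_h * dst_w:
        # The width is the limiting side: fill the full target width.
        new_w = dst_w
        new_h = src_h * dst_w // src_w
    else:
        # The height is the limiting side: fill the full target height.
        new_h = dst_h
        new_w = src_w * dst_h // src_h
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top


print(letterbox_geometry(640, 480))
```

A 640 x 480 input is limited by its width, so it fills the 416-pixel width and is padded above and below.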
model.YoloNet
predict_52: torch.Tensor batch_size(1) * 3(x+y+w+h+conf+class_conf_list) * height(52) * width(52)
* anchor_h
model.YoloNet's predict_layer
predict_26: torch.Tensor batch_size(1) * 3(x+y+w+h+conf+class_conf_list) * height(26) * width(26)
torch.cat
images: torch.Tensor batch_size(1) * channels(RGB) * height * width
x
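Main logic: 1. Convert the predicted boxes from xywh format to (xmin, ymin, xmax, ymax) format. 2. Convert the predictions from (class_conf_list) format to (class_conf, class_label) format. 3. Using conf = obj_conf * class_conf, apply the preset confidence threshold as an initial filter on the predicted boxes. 4. Using torchvision.ops.nms, apply the preset IoU threshold as the final filter on the predicted boxes.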
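The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in yolo_utils.non_max_suppression: (x, y) is taken to be the box center, the final step is a hand-written greedy NMS standing in for torchvision.ops.nms, suppression is applied per class, and the function names and default thresholds are hypothetical.

```python
def xywh_to_xyxy(box):
    # Assumption: (x, y) is the box center and (w, h) the full width and height.
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    # Intersection over union of two (xmin, ymin, xmax, ymax) boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(predictions, conf_thres=0.5, iou_thres=0.4):
    """predictions: rows of [x, y, w, h, obj_conf, class_conf_0, class_conf_1, ...]."""
    candidates = []
    for row in predictions:
        box = xywh_to_xyxy(row[:4])                        # step 1
        class_scores = row[5:]
        class_label = max(range(len(class_scores)), key=class_scores.__getitem__)
        class_conf = class_scores[class_label]             # step 2
        score = row[4] * class_conf                        # conf = obj_conf * class_conf
        if score >= conf_thres:                            # step 3
            candidates.append((box, score, class_label))
    # Step 4: greedy NMS, highest score first, applied per class (an assumption).
    candidates.sort(key=lambda c: c[1], reverse=True)
    kept = []
    for cand in candidates:
        if all(cand[2] != k[2] or iou(cand[0], k[0]) <= iou_thres for k in kept):
            kept.append(cand)
    return kept


rows = [
    [100, 100, 40, 40, 0.9, 0.8, 0.1],  # class 0, kept
    [102, 100, 40, 40, 0.8, 0.7, 0.2],  # class 0, overlaps the first box heavily
    [300, 300, 50, 50, 0.9, 0.1, 0.9],  # class 1, kept
    [200, 200, 30, 30, 0.3, 0.9, 0.1],  # falls below the confidence threshold
]
for box, score, label in postprocess(rows):
    print(label, box)
```

In a tensor implementation the same filtering would typically be done with boolean masks, with the surviving boxes and scores passed to torchvision.ops.nms.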
scaled_image: PIL.Image.Image
as_tensor
predict_layer: torch.Tensor batch_size(1) * 3(x+y+w+h+conf+class_conf_list) * feature_height * feature_width
Convert to tensor
predict_output: torch.Tensor batch_size(1) * anchors_total_num(sum(3hw)) * (x+y+w+h+conf+class_conf_list)
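As a quick arithmetic check of anchors_total_num(sum(3hw)), the sketch below sums 3 * h * w over the 13, 26, and 52 grids shown in the diagram. The function names are hypothetical, and num_classes is an illustrative parameter because the diagram does not state the class count.

```python
def total_anchors(grid_sizes=(13, 26, 52), anchors_per_cell=3):
    # sum(3hw) over the three square prediction grids.
    return sum(anchors_per_cell * g * g for g in grid_sizes)


def prediction_channels(num_classes, anchors_per_cell=3):
    # Each anchor carries x, y, w, h, conf plus one score per class.
    return anchors_per_cell * (4 + 1 + num_classes)


print(total_anchors())          # 3 * (169 + 676 + 2704)
print(prediction_channels(80))  # 80 classes is an illustrative value only
```

With the three grids above, predict_output has 10647 rows per image.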
conf
exp
feature_13: torch.Tensor batch_size(1) * channels(1024) * height(13) * width(13)
* stride_w
h
class_conf_list
yolo_utils.non_max_suppression
yolo_utils.DecodeBox
view: bs * 3 * h * w * 4 -> bs * 3hw * 4
+ grid_y
view: batch_size(1) * 3 * feature_height * feature_width * (x+y+w+h+conf+class_conf_list)
view: bs * 3 * h * w * class_conf_list -> bs * 3hw * class_conf_list
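The decode nodes (sigmoid, + grid_x, + grid_y, * stride_w, * stride_h, exp, * anchor_w, * anchor_h) can be read as the following per-anchor computation. This is a sketch rather than the yolo_utils.DecodeBox code: it assumes the anchors are given in input-image pixels, so the stride scales only the box center, and the function name decode_cell and the sample anchor values are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_cell(raw, grid_x, grid_y, anchor_w, anchor_h, stride_w, stride_h):
    """Decode one anchor's raw outputs [tx, ty, tw, th, tconf, class logits...]."""
    tx, ty, tw, th, tconf = raw[:5]
    # Center: sigmoid offset inside the cell, plus the cell index, scaled to pixels.
    x = (sigmoid(tx) + grid_x) * stride_w
    y = (sigmoid(ty) + grid_y) * stride_h
    # Size: exponential scaling of the anchor (assumed to be in pixels already).
    w = math.exp(tw) * anchor_w
    h = math.exp(th) * anchor_h
    conf = sigmoid(tconf)
    class_conf_list = [sigmoid(c) for c in raw[5:]]
    return [x, y, w, h, conf] + class_conf_list


# All-zero raw outputs in cell (6, 6) of the 13 x 13 grid (stride 416 / 13 = 32).
print(decode_cell([0.0] * 7, 6, 6, 116, 90, 32, 32))
```

If the anchors were instead expressed in grid units, the stride multiplication would apply to w and h as well; the diagram does not say which convention is used.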
Main logic: 1. Extract five feature layers from the input image. 2. Return the last three, which are the effective feature layers.
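The shapes of the five feature layers can be traced with a small sketch. It assumes the standard Darknet-53 layout, in which each of the five stages halves the spatial size; the 64 and 128 channel counts of the first two stages come from that standard layout and do not appear in the diagram, and the function names are hypothetical.

```python
def darknet_stage_shapes(input_size=416, stage_channels=(64, 128, 256, 512, 1024)):
    # Each stage starts with a stride-2 convolution, halving height and width.
    shapes = []
    size = input_size
    for channels in stage_channels:
        size //= 2
        shapes.append((channels, size, size))
    return shapes


def effective_features(shapes):
    # Only the last three stages are returned to the detection head.
    return shapes[-3:]


stages = darknet_stage_shapes()
for channels, height, width in effective_features(stages):
    print(f"channels({channels}) * height({height}) * width({width})")
```

For a 416 x 416 input the last three stages match feature_52, feature_26, and feature_13 in the diagram.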
feature_52: torch.Tensor batch_size(1) * channels(256) * height(52) * width(52)