YOLOv5s algorithm optimization based on multi-scale feature extraction

Computer Engineering & Science, 2023, Vol. 45, Issue (06): 1054-1062.

LI Xiao-lin (1,2), WANG Fu-gang (1,2), ZHANG Peng-fei (1,2), ZHANG Lin-yu (1,2)

(1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; 2. Research Center of New Telecommunication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)

Received: 2021-12-02; Revised: 2022-05-01; Accepted: 2023-06-25; Online: 2023-06-25; Published: 2023-06-16

Abstract

Abstract: Object detection algorithms are widely used in autonomous driving, robot vision, industrial automation, and other fields, and have important research value. Among the many object detection algorithms, YOLOv5s has a small parameter count and fast detection speed, but its detection accuracy is limited. To address the weak feature extraction ability and the feature redundancy of the standard convolution module in YOLOv5s, two convolution modules based on multi-scale feature extraction are proposed. First, a multi-receptive-field convolution module is proposed, which obtains semantic information of different granularities through convolution kernels of multiple sizes in order to improve the feature extraction ability of the model. Second, a feature map convolution module is proposed, which uses a small number of standard convolution kernels together with grouped convolution to reduce the mutual constraints between feature channels and improve the diversity of the feature maps. Finally, some of the standard convolution modules in YOLOv5s are replaced with the multi-receptive-field convolution module and the feature map convolution module to obtain the improved algorithm. Experimental results on the PASCAL VOC dataset show that the improved algorithm raises detection accuracy while preserving the real-time detection ability of YOLOv5s: mAP_0.5 and mAP_0.5:0.95 increase by 2.4% and 4.9%, respectively, which demonstrates the effectiveness of the improved algorithm. Experiments on the DOTA dataset further verify that the improved algorithm generalizes well to different environments.

Key words: object detection, multi-scale feature, receptive field, feature redundancy
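The abstract does not give the internal layout of either proposed module, so the following Python sketch only illustrates the two general ideas it names: parallel kernels of several sizes, and a few standard kernels combined with grouped convolution. It compares weight counts using the usual formula for a 2-D convolution, (c_in / groups) * c_out * k * k. The function names, the equal split of output channels across branches, the half-and-half split between standard and grouped kernels, and the channel sizes are all assumptions made for illustration; they are not taken from the paper.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights (no bias) in a 2-D convolution with a k x k kernel."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    return (c_in // groups) * c_out * k * k


def multi_kernel_params(c_in, c_out, kernel_sizes):
    """Hypothetical multi-receptive-field layout: one parallel branch per
    kernel size, each producing an equal share of the output channels."""
    if c_out % len(kernel_sizes):
        raise ValueError("c_out must split evenly across the branches")
    share = c_out // len(kernel_sizes)
    return sum(conv_params(c_in, share, k) for k in kernel_sizes)


def mixed_standard_grouped_params(c_in, c_out, k):
    """Hypothetical layout: standard kernels produce half of the output maps,
    and a grouped convolution (one group per map) derives the other half."""
    if c_out % 2:
        raise ValueError("c_out must be even for the half-and-half split")
    primary = c_out // 2
    derived = c_out - primary
    standard = conv_params(c_in, primary, k)
    grouped = conv_params(primary, derived, k, groups=primary)
    return standard + grouped


if __name__ == "__main__":
    c_in, c_out = 64, 96  # illustrative channel sizes, not from the paper
    print("standard 3x3:        ", conv_params(c_in, c_out, 3))                    # 55296
    print("branches 1x1/3x3/5x5:", multi_kernel_params(c_in, c_out, (1, 3, 5)))    # 71680
    print("standard + grouped:  ", mixed_standard_grouped_params(c_in, c_out, 3))  # 28080
```

Under these assumed sizes, the grouped variant needs roughly half the weights of a standard 3x3 layer, while the multi-kernel variant costs more because of its 5x5 branch. This is one reason such modules are typically used to replace only some of the standard convolutions rather than all of them.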

LI Xiao-lin, WANG Fu-gang, ZHANG Peng-fei, ZHANG Lin-yu. YOLOv5s algorithm optimization based on multi-scale feature extraction[J]. Computer Engineering & Science, 2023, 45(06): 1054-1062.
