• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (06): 1054-1062.

• Graphics and Images •

Optimization of the YOLOv5s algorithm based on multi-scale feature extraction

李校林1,2,王复港1,2,张鹏飞1,2,张琳玉1,2   

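  1. (1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; 2. Research Center of New Telecommunication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065)
  • Received: 2021-12-02 Revised: 2022-05-01 Accepted: 2023-06-25 Publication date: 2023-06-25 Release date: 2023-06-16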

YOLOv5s algorithm optimization based on multi-scale feature extraction

LI Xiao-lin1,2, WANG Fu-gang1,2, ZHANG Peng-fei1,2, ZHANG Lin-yu1,2

  1. (1. School of Communication and Information Engineering,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
    2. Research Center of New Telecommunication Technology,
    Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
  • Received: 2021-12-02 Revised: 2022-05-01 Accepted: 2023-06-25 Online: 2023-06-25 Published: 2023-06-16

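Abstract: Object detection algorithms are widely used in autonomous driving, robot vision, industrial automation, and other fields, and are of significant research value. Among the many object detection algorithms, YOLOv5s offers a small parameter count and fast detection speed, but its detection accuracy is limited. To address the weak feature extraction capability and the feature redundancy of the standard convolution module in YOLOv5s, two convolution modules based on multi-scale feature extraction are proposed. First, a multi-receptive-field convolution module is proposed, which captures semantic information at different granularities through convolution kernels of multiple sizes in order to improve the model's feature extraction capability. Next, a feature map convolution module is proposed, which uses a small number of standard convolution kernels together with grouped convolution to reduce the mutual constraints between feature channels and increase the diversity of the feature maps. Finally, some of the standard convolution modules in YOLOv5s are replaced with the multi-receptive-field convolution module and the feature map convolution module to obtain the improved algorithm. Experimental results on the PASCAL VOC dataset show that the improved algorithm raises detection accuracy while preserving the real-time detection capability of YOLOv5s: mAP_0.5 and mAP_0.5:0.95 increase by 2.4% and 4.9%, respectively, which demonstrates the effectiveness of the improved algorithm. Experiments on the DOTA dataset further verify that the improved algorithm generalizes well across different environments.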

Key words: object detection, multi-scale feature, receptive field, feature redundancy

Abstract: Object detection algorithms are widely used in autonomous driving, robot vision, industrial automation, and other fields, and have important research value. Among the many object detection algorithms, YOLOv5s has the advantages of fast detection speed and a small parameter scale, but its detection accuracy is low. To address the weak feature extraction capability and the feature redundancy of the standard convolution module in YOLOv5s, two convolution modules based on multi-scale feature extraction are proposed. Firstly, a multi-receptive-field convolution module is proposed to improve the feature extraction ability of the model; it obtains semantic information of different granularities through convolution kernels of multiple sizes. Secondly, a feature map convolution module is proposed to improve the diversity of feature maps; it uses a small number of standard convolution kernels and grouped convolutions to reduce the mutual constraints between feature channels. Finally, some standard convolution modules of YOLOv5s are replaced by the multi-receptive-field convolution module and the feature map convolution module to obtain the improved algorithm. The experimental results on the PASCAL VOC dataset show that the improved algorithm not only improves the detection accuracy but also maintains the real-time detection ability of YOLOv5s. mAP_0.5 and mAP_0.5:0.95 are increased by 2.4% and 4.9%, respectively, which demonstrates the effectiveness of the improved algorithm. Experiments on the DOTA dataset further verify that the improved algorithm has good generalization ability in different environments.

Key words: object detection, multi-scale feature, receptive field, feature redundancy
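
The abstract describes the two proposed modules only in outline, so the following is a rough sketch rather than the authors' design: a back-of-the-envelope weight count in Python that compares a standard convolution with one possible layout of each module. The branch layout, the kernel sizes `(1, 3, 5, 7)`, the `ratio` parameter, and all function names are assumptions made for illustration; the page itself does not specify any of them.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a bias-free 2D convolution with a k x k kernel."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def standard_block(c_in, c_out, k=3):
    """Baseline: one standard k x k convolution."""
    return conv_params(c_in, c_out, k)


def multi_receptive_field_block(c_in, c_out, kernel_sizes=(1, 3, 5, 7)):
    """Assumed layout: one parallel branch per kernel size, each producing
    an equal share of the output channels, concatenated at the end."""
    assert c_out % len(kernel_sizes) == 0
    c_branch = c_out // len(kernel_sizes)
    return sum(conv_params(c_in, c_branch, k) for k in kernel_sizes)


def feature_map_block(c_in, c_out, k=3, ratio=2, cheap_k=3):
    """Assumed layout: a few standard kernels produce c_out / ratio primary
    maps; a grouped convolution (one group per primary map) derives the
    remaining maps from them."""
    assert c_out % ratio == 0
    c_primary = c_out // ratio
    c_derived = c_out - c_primary
    primary = conv_params(c_in, c_primary, k)
    derived = conv_params(c_primary, c_derived, cheap_k, groups=c_primary)
    return primary + derived


if __name__ == "__main__":
    c_in, c_out = 64, 128  # hypothetical channel widths
    print("standard 3x3:          ", standard_block(c_in, c_out))
    print("multi-receptive-field: ", multi_receptive_field_block(c_in, c_out))
    print("feature map (ratio=2): ", feature_map_block(c_in, c_out))
```

Under these assumptions, the grouped convolution adds very few weights compared with the standard kernels, which is one way to read the abstract's point about producing diverse feature maps from a small number of standard convolution kernels. The multi-kernel branches cost more or less than the baseline depending on which kernel sizes are chosen.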