• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2023, Vol. 45, Issue (06): 1054-1062.

• Graphics and Images •

YOLOv5s algorithm optimization based on multi-scale feature extraction

LI Xiao-lin1,2, WANG Fu-gang1,2, ZHANG Peng-fei1,2, ZHANG Lin-yu1,2

  (1. School of Communication and Information Engineering,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
    2. Research Center of New Telecommunication Technology,
    Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
  • Received: 2021-12-02 Revised: 2022-05-01 Accepted: 2023-06-25 Online: 2023-06-25 Published: 2023-06-16

Abstract: Object detection algorithms are widely used in autonomous driving, robot vision, industrial automation, and other fields, and have significant research value. Among the many object detection algorithms, YOLOv5s offers fast detection and a small parameter count, but its detection accuracy is comparatively low. To address the weak feature extraction capability and feature redundancy of the standard convolution module in YOLOv5s, two convolution modules based on multi-scale feature extraction are proposed. First, a multi-receptive field convolution module is proposed to improve the model's feature extraction ability; it obtains semantic information at different granularities through convolution kernels of multiple sizes. Second, a feature map convolution module is proposed to improve the diversity of feature maps; it uses a small number of standard convolution kernels together with grouped convolutions to reduce the mutual constraints between feature channels. Finally, some of the standard convolution modules in YOLOv5s are replaced with the multi-receptive field convolution module and the feature map convolution module, yielding the improved algorithm. Experimental results on the Pascal VOC dataset show that the improved algorithm raises detection accuracy while maintaining the real-time detection capability of YOLOv5s: mAP_0.5 and mAP_0.5:0.95 increase by 2.4% and 4.9%, respectively, which demonstrates the effectiveness of the improved algorithm. Further experiments on the DOTA dataset verify that the improved algorithm generalizes well to different environments.
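
The abstract does not specify the layer layout of either module, so the following is only an illustrative sketch of why pairing a few standard kernels with grouped convolutions keeps the parameter count small. The helper `conv_params`, the channel widths (128 in, 128 out), and the half-and-half split between a standard and a grouped branch are all assumptions chosen for the arithmetic, not the authors' design.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Count learnable parameters of a 2-D convolution layer.

    Each output channel sees only c_in / groups input channels,
    so the weight tensor has (c_in / groups) * c_out * k * k entries.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


# Baseline: one standard 3x3 convolution, 128 -> 128 channels.
standard = conv_params(128, 128, 3)

# Hypothetical hybrid layout (an assumption, not the paper's module):
# a standard 3x3 convolution produces half of the output channels,
# and a grouped 3x3 convolution (one group per channel) derives the
# other half from those intermediate feature maps.
primary = conv_params(128, 64, 3)
grouped = conv_params(64, 64, 3, groups=64)
hybrid = primary + grouped

print("standard:", standard)
print("hybrid:  ", hybrid)
print("ratio:    %.3f" % (hybrid / standard))
```

Under these assumed widths the hybrid layout needs roughly half the weights of the standard layer, which is consistent with the abstract's claim that accuracy can be improved without giving up the lightweight, real-time character of YOLOv5s.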

Key words: object detection, multi-scale feature, receptive field, feature redundancy