• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (07): 1265-1272.

• Graphics and Images

Research on object detection based on a lightweight neural network (基于轻量级神经网络的目标检测研究)

HUANG Zhi-qiang (黄志强)1, LI Jun (李军)1, ZHANG Shi-yi (张世义)2

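  1. (1. School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074; 2. School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074)
  • Received: 2020-12-03 Revised: 2021-06-23 Accepted: 2022-07-25 Online: 2022-07-25 Published: 2022-08-17
  • Supported by:
    Project of the Chongqing Key Laboratory of Rail Transit Vehicle System Integration and Control (CSTC2015yfpt-zdsys30001)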

Object detection research based on lightweight neural network

HUANG Zhi-qiang1, LI Jun1, ZHANG Shi-yi2

  1. (1. School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074;
    2. School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China)

  • Received: 2020-12-03 Revised: 2021-06-23 Accepted: 2022-07-25 Online: 2022-07-25 Published: 2022-08-17

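Abstract (translated from Chinese): The YOLOv4 neural network with a CSPDarknet53 backbone has a very large number of parameters, so porting it to small devices such as mobile phones lowers its detection accuracy and speed. To raise detection speed while keeping accuracy within a reasonable range, the original 53-layer network is cut to 15 layers and its clustering step is optimized: the K-means++ clustering algorithm is applied to the data set to generate Anchor Boxes that satisfy the detection conditions. The LeakyReLU activation function, which keeps a nonzero slope on the negative interval, replaces the Sigmoid activation function, which suffers from vanishing gradients, and this strengthens the learning ability of the shallow network. Because the center distance and the aspect ratio between the Bounding Box and the Anchor Box are correlated, corresponding penalty terms are added to the original loss function to form the LCIoU loss function, which gives gradient descent a better direction during backpropagation. Experiments show that the improved CSPDarknet15 network reaches an average precision of 83.94% on the VOC2007 data set with a detection time of 3 625 ms per image; compared with the CSPDarknet53 network, detection speed improves by 54.43%, which meets the speed and accuracy requirements of real-time detection on small devices.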

Key words: YOLOv4 neural network, K-means++ clustering algorithm, LeakyReLU activation function, LCIoU loss function
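The anchor-generation step named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common YOLO convention of clustering ground-truth (width, height) pairs with 1 - IoU as the distance, and it applies K-means++ seeding before ordinary K-means refinement. The function names `iou_wh`, `kmeanspp_init`, and `kmeans_anchors` are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeanspp_init(boxes, k, rng):
    """K-means++ seeding: each new centroid is sampled with probability
    proportional to the squared distance (1 - IoU) to its nearest centroid."""
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0.0:
            # Every box coincides with a centroid; fall back to uniform sampling.
            centroids.append(rng.choice(boxes))
        else:
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centroids


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors, returned sorted by area.

    Assumption: centroids are updated with the cluster mean; some YOLO
    tooling uses the median instead.
    """
    rng = random.Random(seed)
    centroids = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new = []
        for i, members in enumerate(clusters):
            if members:
                new.append((sum(w for w, _ in members) / len(members),
                            sum(h for _, h in members) / len(members)))
            else:
                new.append(centroids[i])  # keep the centroid of an empty cluster
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])
```

Calling `kmeans_anchors(boxes, k=9)` on the (width, height) pairs of a labeled data set would yield nine anchors; like any K-means variant, the result depends on the seed and may settle in a local optimum.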

Abstract: Because the YOLOv4 neural network with CSPDarknet53 as its backbone has a huge number of parameters, its detection accuracy and speed drop when it is ported to small devices such as mobile phones. To improve detection speed while keeping detection accuracy within a reasonable range, this paper reduces the original 53-layer neural network to 15 layers and optimizes its clustering algorithm: the K-means++ clustering algorithm is introduced to analyze the data set and generate anchor boxes that satisfy the detection conditions. The LeakyReLU activation function, which has a nonzero slope in the negative interval, replaces the Sigmoid activation function, which suffers from vanishing gradients, thereby enhancing the learning ability of the shallow network. In addition, because the center distance and the aspect ratio between the Bounding Box and the Anchor Box are correlated, corresponding penalty terms are added to the original loss function to form the LCIoU loss function, so that gradient descent has better directionality during backpropagation. Experimental results show that the improved CSPDarknet15 neural network achieves an average precision of 83.94% on the VOC2007 data set, with a detection time of 3 625 ms per image. Compared with the CSPDarknet53 neural network, detection speed is increased by 54.43%, which meets the speed and accuracy requirements of real-time detection on small devices.

Key words: YOLOv4 neural network, K-means++ clustering algorithm, LeakyReLU activation function, LCIoU loss function
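The abstract does not give the form of the LCIoU penalty terms, so they are not reproduced here. As a point of reference only, the sketch below computes the standard CIoU loss used by YOLOv4, which already penalizes center distance and aspect-ratio mismatch and is presumably the baseline that LCIoU extends. Boxes are assumed to be (cx, cy, w, h) tuples, and the name `ciou_loss` is hypothetical.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Standard CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v.

    pred and target are (cx, cy, w, h) boxes. This is the baseline loss,
    not the LCIoU loss proposed in the paper.
    """
    px, py, pw, ph = pred
    tx, ty, tw, th = target

    # Convert center format to corner coordinates.
    px1, py1, px2, py2 = px - pw / 2, py - ph / 2, px + pw / 2, py + ph / 2
    tx1, ty1, tx2, ty2 = tx - tw / 2, ty - th / 2, tx + tw / 2, ty + th / 2

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared center distance over squared diagonal of the enclosing box.
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / th) - math.atan(pw / ph)) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

For two boxes of the same shape the aspect-ratio term vanishes and only the IoU and center-distance terms remain; a box with the same center and area but a different aspect ratio is penalized through `alpha * v`.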