An improved dense pedestrian detection algorithm based on YOLOv8: MER-YOLO

Abstract

Abnormal crowd gatherings occur from time to time in large, crowded venues, which challenges the dense pedestrian detection used in applications such as autonomous driving and crowd monitoring in large public places. Next-generation dense pedestrian detection calls for higher accuracy, lower computational overhead, faster detection, and easier deployment. To meet these requirements, a lightweight dense pedestrian detection algorithm based on YOLOv8, MER-YOLO, is proposed. First, MobileViT is used as the backbone network to improve the model's overall feature extraction ability in areas where pedestrians gather. Second, the EMA attention module is introduced to encode global information and further aggregate pixel-level features through dimensional interaction; combined with a detection head at the 160×160 scale, it strengthens the detection of small targets. Third, Repulsion Loss is used as the bounding box loss function to reduce missed detections and false detections of small pedestrians in dense crowds. Experimental results show that, compared with YOLOv8n, MER-YOLO improves mAP@0.5 by 4.5% on the CrowdHuman dataset and by 2.1% on the WiderPerson dataset, with only 3.1×10^6 parameters and 9.8 GFLOPs, which meets the deployment requirements of low computing power and high precision.
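The abstract does not spell out how Repulsion Loss is integrated into MER-YOLO, so the following is only a minimal sketch of the ground-truth repulsion term (RepGT) as defined in the original Repulsion Loss formulation, not the authors' implementation. The function names, the `(x1, y1, x2, y2)` box layout, and the default `sigma=0.5` are assumptions made for illustration. The term penalizes a predicted box for overlapping a neighboring ground-truth box that is not its own target, which is the mechanism that discourages merged or shifted detections in dense crowds.

```python
import math


def iog(pred, gt):
    """Intersection over ground truth: area(pred ∩ gt) / area(gt).

    Boxes are (x1, y1, x2, y2) tuples (an assumed layout).
    """
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return (iw * ih) / gt_area if gt_area > 0 else 0.0


def smooth_ln(x, sigma=0.5):
    """Smoothed -ln(1 - x): logarithmic up to sigma, linear beyond it.

    Requires 0 <= sigma < 1 so that the logarithm stays finite.
    """
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)


def rep_gt_term(preds, repulsion_gts, sigma=0.5):
    """Mean Smooth_ln(IoG) between each positive prediction and its
    repulsion ground truth, i.e. the non-target ground-truth box it
    overlaps most. Pairing is assumed to be done by the caller.
    """
    if not preds:
        return 0.0
    total = sum(smooth_ln(iog(p, g), sigma) for p, g in zip(preds, repulsion_gts))
    return total / len(preds)
```

In the original formulation this term is added, with a weighting coefficient, to an attraction term that pulls each prediction toward its assigned target; how MER-YOLO weights the terms is not stated in the abstract.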

Keywords: object detection, pedestrian detection, lightweight, attention mechanism


WANG Ze-yu, XU Hui-ying, ZHU Xin-zhong, LI Chen, LIU Zi-yang, WANG Zi-yi. An improved dense pedestrian detection algorithm based on YOLOv8: MER-YOLO[J]. Computer Engineering & Science, 2024, 46(06): 1050-1062.


