• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (06): 1050-1062.

• Graphics and Images •

An improved dense pedestrian detection algorithm based on YOLOv8: MER-YOLO

WANG Ze-yu1, XU Hui-ying1, ZHU Xin-zhong1, LI Chen1, LIU Zi-yang1, WANG Zi-yi2

  1. College of Computer Science and Technology, Zhejiang Normal University, Jinhua 321004, China
  2. Computer Science Department, School of Engineering, The University of Manchester, Manchester M13 9PL, UK

  • Received: 2023-08-05  Revised: 2023-10-15  Accepted: 2024-06-25  Online: 2024-06-25  Published: 2024-06-18
  • Supported by:
    National Natural Science Foundation of China (62376252, 61976196); Zhejiang Provincial Natural Science Foundation (LZ22F030003); National Undergraduate Innovation Training Program, Key Project (202310345042)

Abstract: Abnormal crowd gatherings occur from time to time in large, crowded venues, which poses challenges for the dense pedestrian detection used in applications such as autonomous driving and crowd-flow monitoring in large public places. Next-generation dense pedestrian detection is expected to offer higher accuracy, lower computational cost, faster detection, and easier deployment. To meet these requirements, a lightweight dense pedestrian detection algorithm based on YOLOv8, MER-YOLO, is proposed. First, MobileViT is adopted as the backbone network to improve the model's overall feature extraction in areas where pedestrians cluster. Second, an EMA attention module is introduced to encode global information and further aggregate pixel-level features through cross-dimension interaction; combined with a detection head at the 160×160 scale, it strengthens small-object detection. Third, Repulsion Loss is used as the bounding-box loss function, which reduces missed and false detections of small pedestrians in dense crowds. Experimental results show that, compared with YOLOv8n, MER-YOLO improves mAP@0.5 by 4.5% on the CrowdHuman dataset and by 2.1% on the WiderPerson dataset, with only 3.1×10^6 parameters and 9.8 GFLOPs, meeting the need for high accuracy on low-compute deployments.

Key words: object detection, pedestrian detection, lightweight, attention mechanism
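
The abstract names Repulsion Loss as the bounding-box loss but does not say how it is wired into MER-YOLO. As a rough illustration only, the sketch below computes the ground-truth repulsion term (RepGT) as it is usually described for Repulsion Loss: a predicted box is penalised for overlapping the non-target ground-truth box it overlaps most. The overlap is measured as intersection over ground-truth area (IoG) and passed through a smoothed logarithm. The function names, the `(x1, y1, x2, y2)` box layout, and the default `sigma=0.5` are assumptions made for this example and are not taken from the paper.

```python
import math

def area(box):
    # box is (x1, y1, x2, y2); degenerate boxes have zero area
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    # area of the overlap rectangle of boxes a and b
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))

def iou(a, b):
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def iog(pred, gt):
    # intersection over ground-truth area; lies in [0, 1]
    g = area(gt)
    return intersection(pred, gt) / g if g > 0 else 0.0

def smooth_ln(x, sigma=0.5):
    # smoothed -ln(1 - x): logarithmic up to sigma, linear beyond it
    # (assumes 0 <= sigma < 1 so the logarithm stays finite)
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)

def rep_gt_term(pred, target_idx, gts, sigma=0.5):
    # pick the non-target ground truth with the largest IoU and penalise
    # the prediction's overlap with it
    others = [g for i, g in enumerate(gts) if i != target_idx]
    if not others:
        return 0.0
    rep = max(others, key=lambda g: iou(pred, g))
    return smooth_ln(iog(pred, rep), sigma)
```

In the full loss this term would be averaged over positive predictions and combined with an attraction term that pulls each prediction toward its own target, and with a box-to-box repulsion term. The weighting used in MER-YOLO is not given in the abstract.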

CLC Number: