• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (07): 1278-1285.

• Graphics and Images •

Pedestrian detection based on multi-scale features and mutual supervision

XIAO Zhen-jiu, LI Si-qi, QU Hai-cheng

  1. (College of Software, Liaoning Technical University, Huludao 125105, China)

  • Received: 2023-05-25 Revised: 2023-09-19 Accepted: 2024-07-25 Online: 2024-07-25 Published: 2024-07-19
  • Funding:
    Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LJKMZ20220699); Discipline Innovation Team of Liaoning Technical University (LNTU20TD-23)

Abstract: To address the high miss rate and low accuracy of pedestrian detection in crowded scenes, a crowded pedestrian detection network based on multi-scale features and mutual supervision is proposed. To extract pedestrian features effectively in complex scenes, the feature extractor combines the PANet pyramid network with hybrid dilated convolution. A head-body mutual supervision detection network is then designed that detects heads and full bodies separately and uses mutual supervision between the head and full-body predicted boxes to obtain more accurate pedestrian detection results. On the CrowdHuman dataset, the proposed network achieves an MR-2 of 13.5%, an improvement of 3.6% over the YOLOv5 network, while average precision (AP) improves by 3.5%. On the CityPersons dataset, it achieves an MR-2 of 48.2%, an improvement of 2.3% over the YOLOv5 network, while AP improves by 2.8%. The experimental results indicate that the proposed network detects pedestrians well in densely crowded scenes.

Key words: crowded scene, pedestrian detection, multi-scale network, mutual supervision
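
The abstract reports results as MR-2 without defining the metric. In pedestrian detection benchmarks this usually denotes the log-average miss rate: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space over [10^-2, 10^0], where lower is better. The following is a minimal Python sketch under that assumption. The function name `log_average_miss_rate` and its inputs are illustrative and are not taken from the paper or from any benchmark toolkit.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Log-average miss rate over FPPI in [1e-2, 1e0] (assumed meaning of MR-2).

    fppi must be sorted in ascending order; miss_rate[i] is the miss rate
    of the detector at fppi[i]. Both names are hypothetical inputs.
    """
    if len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must have the same length")

    # Reference FPPI values evenly spaced in log space: 10^-2, ..., 10^0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Take the last curve point whose FPPI does not exceed the reference.
        # If the curve starts above the reference, treat the miss rate as 1.0.
        value = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                value = m
            else:
                break
        sampled.append(max(value, 1e-10))  # guard against log(0)

    # Geometric mean of the sampled miss rates.
    return math.exp(sum(math.log(v) for v in sampled) / len(sampled))
```

Because the average is taken in log space, a detector is rewarded for keeping the miss rate low across the whole FPPI range rather than at a single operating point.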

CLC Number: