• Official journal of the China Computer Federation (CCF)
• Chinese Science and Technology Core Journal
• Chinese Core Journal

计算机工程与科学


面向人员密集与遮挡的实时目标检测方法

盛 伟, 刘明剑, 刘殿臣   

  (1. 大连海洋大学信息工程学院,辽宁省 大连市 116023;
   2. 大连海洋大学设施渔业教育部重点实验室,辽宁省 大连市 116023)

Real-time object detection method for crowded and occluded scenes

SHENG Wei, LIU Ming-jian, LIU Dian-chen

  (1. College of Information Engineering, Dalian Ocean University, Dalian, Liaoning 116023, China;
   2. Key Laboratory of Environment Controlled Aquaculture, Ministry of Education, Dalian Ocean University, Dalian, Liaoning 116023, China)

摘要: 人员密集场景的目标检测在实时系统中至关重要,但面临硬件资源有限和遮挡问题,导致检测延迟和精度下降。本文提出了一种遮挡感知轻量级目标检测网络,包括主干、特征融合和输出预测三部分。该网络使用快速网络块提取特征,并通过位置注意力机制关注遮挡边界。主干部分的特征金字塔串联汇聚模块减少信息丢失,提高对不同尺度和遮挡人员的识别能力。特征融合部分采用分组洗牌卷积,优化特征流动而不增加计算负担。输出预测部分使用任务对齐单阶段目标检测方法,提升遮挡条件下的识别准确性。实验结果显示,网络在WiderPerson数据集上的召回率达66.8%,比YOLOv8-n高2.0个百分点,且模型参数仅1.8M,运行效率优于其他模型。在UpDown数据集上,分类错误率和未检测目标错误率分别为2.6%和1.3%,分别比YOLOv8低0.4和0.7个百分点。实验验证了该网络在资源有限设备中的高效性。

关键词: 人员密集检测, 人员行为遮挡检测, 计算资源受限, 类间遮挡和类内遮挡, 增强位置注意力机制模块, 特征金字塔串联汇聚模块

Abstract: Object detection in crowded scenes is crucial for real-time systems, but limited hardware resources and occlusion lead to detection latency and reduced accuracy. This paper proposes an Occlusion-aware Lightweight Object Detection Network (OLODN) consisting of three components: Backbone, Neck, and Head. The network extracts features with FasterNet Blocks and uses Reinforced Coordination Attention (RCA) to focus on occlusion boundaries. In the Backbone, a Spatial Pyramid Pooling Feature Concatenation (SPPFC) module reduces information loss and improves recognition of people at different scales and under occlusion. The Neck uses Grouped Shuffle Convolution to improve feature flow and integration without adding computational cost. The Head adopts Task-aligned One-stage Object Detection (TOOD) to improve recognition accuracy under occlusion. Experimental results show that OLODN achieves a recall of 66.8% on the WiderPerson dataset, 2.0 percentage points higher than YOLOv8-n, with only 1.8M parameters, and runs more efficiently than the other models compared. On the UpDown dataset, the classification error rate and missed-detection rate are 2.6% and 1.3%, which are 0.4 and 0.7 percentage points lower than those of YOLOv8, respectively. The experiments confirm the efficiency of OLODN in handling occlusion on resource-constrained devices.
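
The abstract does not spell out how the Neck's Grouped Shuffle Convolution is built. As a minimal sketch, assuming it follows the familiar pattern of a grouped convolution followed by a ShuffleNet-style channel shuffle, the Python below illustrates why such a layer can mix features across groups without extra arithmetic. The helper names `conv_params` and `channel_shuffle`, the channel counts, and the group sizes are hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution (bias ignored).

    With `groups` > 1 each output channel only sees c_in / groups
    input channels, so the weight count shrinks by that factor.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style).

    `channels` holds one entry per channel. Viewing the list as a
    (groups, n // groups) grid and reading it column by column lets
    the next grouped layer see channels from every group. This is a
    pure reordering: no multiplications or additions are involved.
    """
    n = len(channels)
    if n % groups:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Assumed example sizes: 64 -> 64 channels, 3 x 3 kernel.
    print(conv_params(64, 64, 3))            # dense convolution
    print(conv_params(64, 64, 3, groups=4))  # grouped convolution
    # Channels 0-3 form group 0, channels 4-7 form group 1.
    print(channel_shuffle(list(range(8)), groups=2))
```

Under these assumed sizes the grouped layer needs a quarter of the dense layer's weights, and the shuffle only reorders channels, which is consistent with the abstract's claim of better feature flow at no added computational cost.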

Key words: Crowd Detection, Occlusion Detection in Human Behavior, Resource-constrained Computing, Inter-class and Intra-class Occlusion, RCA Attention Mechanism, SPPFC Module.