• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Artificial Intelligence and Data Mining

Pedestrian head detection based on deep neural networks

TAO Zhu (陶祝), LIU Zhengxi (刘正熙), XIONG Yunyu (熊运余), LI Zheng (李征)

  1. (College of Computer Science, Sichuan University, Chengdu 610000, China)
  • Received: 2017-06-14 Revised: 2017-08-15 Online: 2018-08-25 Published: 2018-08-25
  • Supported by:

    National Natural Science Foundation of China (61471250)

Abstract:

Pedestrian detection has become a core technology for security, intelligent video surveillance, and visitor-flow statistics in scenic areas. Widely used object detection methods, such as the Fast Region-based Convolutional Neural Network (Fast R-CNN), Faster R-CNN, the Single Shot MultiBox Detector (SSD), and Deformable Part Models (DPM), all detect the pedestrian as a whole. In large scenes, however, pedestrians take varied postures and frequently occlude one another, so accurate localization requires modeling the positions of body parts and capturing local features of the pedestrian. This paper builds a detection model for pedestrian heads on the Faster R-CNN deep network, extracts head features for different orientations at the same time, and adds a spatial pyramid pooling layer to maintain detection speed. The approach effectively addresses partial occlusion of pedestrians in large scenes and clearly shows the general direction of crowd flow, which makes it better suited to pedestrian-flow statistics than ordinary head-count estimation.

Key words: video analysis, pedestrian detection, convolutional neural network, Faster R-CNN, spatial pyramid pooling layer
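
The abstract names a spatial pyramid pooling layer but gives no configuration for it. As a minimal sketch of the general technique, assuming max pooling and pyramid levels of 1, 2, and 4 bins per side (neither is taken from the paper), the NumPy function below maps a feature map of any spatial size to a fixed-length vector. The name `spatial_pyramid_pool` and its parameters are hypothetical.

```python
import math

import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    For each pyramid level n, the map is split into an n x n grid of bins
    and the per-channel maximum of each bin is kept, so the output length
    is C * sum(n * n for n in levels) regardless of H and W.
    The levels (1, 2, 4) are an illustrative assumption, not the paper's.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # floor/ceil bin edges keep every bin non-empty and make the
            # bins cover the whole map (adjacent bins may overlap).
            top = math.floor(i * height / n)
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = math.floor(j * width / n)
                right = math.ceil((j + 1) * width / n)
                window = feature_map[:, top:bottom, left:right]
                pooled.append(window.max(axis=(1, 2)))
    return np.concatenate(pooled)
```

With these assumed levels, a 3-channel map of size 7 x 5 and one of size 13 x 9 both produce a vector of length 3 x (1 + 4 + 16) = 63, which is the property that lets fixed-size layers follow regions of varying size.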