• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Small target pedestrian detection
based on multi-scale feature fusion
 

ZHANG Si-yu 1,2, ZHANG Yi 1,2

  (1. College of Computer Science, Sichuan University, Chengdu 610065;
   2. National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China)
  • Received:2019-01-25 Revised:2019-04-24 Online:2019-09-25 Published:2019-09-25

Abstract:

To address missed detections and detection failures on small targets in the single shot multibox detector (SSD), we propose an hourglass SSD model, called hgSSD, based on deconvolution and feature fusion. It deconvolves the conventional SSD features and fuses them with shallower features to detect small-target pedestrians in complex scenes. To preserve shallow network characteristics, ensure real-time detection, and save computing resources, we use VGG-16 rather than the deeper ResNet-101 as the base network. To enhance the detection of small targets, the Conv3_3 layer of VGG-16 is modified and added to training as a feature layer. The fused network is more complex than the conventional SSD, but real-time performance is largely preserved. It successfully detects most of the small targets missed by the conventional SSD network and achieves higher accuracy than the conventional SSD model. With the default box confidence threshold set to 0.3, it essentially detects the small targets that the conventional SSD fails to detect. On VOC 2007+2012, the pedestrian average precision increases from 0.765 to 0.83 compared with the conventional SSD.
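
The deconvolution-and-fusion idea described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the hgSSD implementation: the function names, the single-channel 2x2 feature maps, the all-ones kernel, and the use of element-wise summation as the fusion operator are all assumptions made for illustration (the abstract does not state how the features are combined, and a real network would learn the kernel weights).

```python
# Illustrative sketch only: names, shapes, and kernel values are assumptions,
# not taken from the paper.

def transposed_conv2d(feature, kernel, stride=2):
    """Upsample a single-channel 2-D feature map with a transposed convolution."""
    h, w = len(feature), len(feature[0])
    k = len(kernel)
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    # Each input value scatters a scaled copy of the kernel into the output.
    for i in range(h):
        for j in range(w):
            for di in range(k):
                for dj in range(k):
                    out[i * stride + di][j * stride + dj] += feature[i][j] * kernel[di][dj]
    return out

def fuse(deep_upsampled, shallow):
    """Fuse two equally sized feature maps by element-wise sum (assumed operator)."""
    assert len(deep_upsampled) == len(shallow)
    assert len(deep_upsampled[0]) == len(shallow[0])
    return [[a + b for a, b in zip(row_d, row_s)]
            for row_d, row_s in zip(deep_upsampled, shallow)]

# A coarse, deep feature map (2x2) and a finer, shallower one (4x4).
deep = [[1.0, 2.0], [3.0, 4.0]]
shallow = [[0.5] * 4 for _ in range(4)]
kernel = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical fixed weights; learned in practice

upsampled = transposed_conv2d(deep, kernel, stride=2)  # 2x2 -> 4x4
fused = fuse(upsampled, shallow)                       # 4x4 fused map
```

In this sketch the deep map is brought up to the resolution of the shallow map before fusion, so a detector head reading `fused` would see both the coarse semantic signal and the finer spatial detail that small targets depend on.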
 

Key words: small target pedestrian detection, multi-scale prediction, feature fusion, deconvolutional neural network, deep learning