• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (07): 1216-1225.

• Graphics and Images

An online multi-pedestrian tracking method with Mask R-CNN

CAO Yu-dong¹, CHEN Dong-hao¹, CAO Rui², ZHAO Lang¹

  1. School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001, China
  2. School of Automation and Electrical Engineering, Dalian Jiaotong University, Dalian 116028, China
  • Received: 2021-11-09  Revised: 2022-05-04  Accepted: 2023-07-25  Online: 2023-07-25  Published: 2023-07-11

Abstract: Pedestrian detection and tracking have attracted much attention in computer vision. This paper proposes an improved multi-pedestrian tracking model that extends the basic Deep SORT framework and integrates Mask R-CNN to perform pedestrian detection, tracking, and pose estimation. The default anchor boxes of the region proposal network (RPN) are replaced with anchor boxes whose aspect ratios better suit pedestrian targets, which speeds up the model and improves performance without complex computation. In addition, an attention mechanism is introduced into the deep residual network: the lightweight SKNet adaptively selects the most suitable convolution kernel, improving the feature representation for target detection. For appearance-based association matching in Deep SORT, histogram of oriented gradients (HOG) features combined with color information replace the convolutional features, so that pedestrian targets can be tracked effectively under occlusion. Ablation studies verify the impact of each improvement, and the proposed model is compared with current mainstream models. Experimental results show that the improvements are effective: on the MOT16 tracking dataset, the proposed model improves on the MOTA of NSH by 6%. On the public datasets tested, the proposed model outperforms the compared models, and it still tracks pedestrian targets effectively when the background moves or pedestrians are occluded.

Key words: pedestrian detection, pedestrian tracking, pose estimation
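
The abstract's idea of replacing the RPN anchor boxes with shapes that better fit pedestrians can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_anchors`, the base size, the scales, and the tall aspect ratios below are all assumptions chosen for illustration, since the abstract does not state the values used.

```python
import math

# Hypothetical values: the abstract gives no concrete ratios or scales.
# Ratios are height / width, so values above 1 describe tall boxes
# like those of upright pedestrians.
PEDESTRIAN_RATIOS = (2.0, 2.5, 3.0)
SCALES = (4, 8, 16)
BASE_SIZE = 16  # assumed feature-map stride in pixels


def make_anchors(base_size, scales, aspect_ratios):
    """Return a list of (width, height) anchor shapes.

    For each scale, the anchor area is fixed at (base_size * scale) ** 2,
    and the width and height are chosen so that height / width equals
    the requested aspect ratio.
    """
    anchors = []
    for scale in scales:
        area = float(base_size * scale) ** 2
        for ratio in aspect_ratios:
            width = math.sqrt(area / ratio)
            height = width * ratio
            anchors.append((width, height))
    return anchors


if __name__ == "__main__":
    for width, height in make_anchors(BASE_SIZE, SCALES, PEDESTRIAN_RATIOS):
        print(f"{width:8.2f} x {height:8.2f}")
```

Restricting the ratio set to tall shapes means every proposal starts closer to a standing-person outline, which is one plausible reading of why the change needs no extra computation: the number of anchors per location stays the same and only their shapes differ.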