• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Articles •

Parallel acceleration of the HOG feature extraction algorithm on the Sunway many-core processor

赵美婷1,2,刘轶1,2,刘锐1,2,宋凯达1,2,钱德沛1,2   

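  1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  2. State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, Jiangsu 214215, China
  • Received: 2016-12-09 Revised: 2017-02-21 Published: 2017-04-25 Online: 2017-04-25
  • Funding:

    National 863 Program of China (2014AA01A301)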

Acceleration of histogram of oriented gradient (HOG) based on Sunway manycore processor

ZHAO Mei-ting1,2, LIU Yi1,2, LIU Rui1,2, SONG Kai-da1,2, QIAN De-pei1,2

  1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  2. State Key Lab of Mathematical Engineering and Advanced Computing, Wuxi 214215, China
  • Received:2016-12-09 Revised:2017-02-21 Online:2017-04-25 Published:2017-04-25

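Abstract:

The HOG feature is a simple and efficient feature descriptor commonly used for object detection and widely applied in areas such as pedestrian detection, yet it faces severe performance challenges when processing massive numbers of images. One solution is to use the processor nodes of the Sunway TaihuLight supercomputer to accelerate pedestrian detection over massive image sets. Two parallel schemes are adopted: in one, a single processor handles 4 images at the same time; in the other, 256 images are processed at the same time. Extensive serial and parallel experiments show that the first scheme suits the parallel processing of multiple high-resolution images, with a speedup of up to 83 times, while the second suits low-resolution images, with a speedup of up to 95. Both parallel schemes scale well across multiple processor nodes of Sunway TaihuLight.

Key words: HOG feature extraction, Sunway TaihuLight, Sunway SW26010, parallel implementation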

Abstract:

The HOG feature is a simple and efficient descriptor commonly used for object detection and widely applied in pedestrian detection and other fields. However, it faces severe performance challenges when dealing with massive numbers of images. One solution is to accelerate the pedestrian detection algorithm for large image sets by using the Sunway SW26010 processor nodes of the Sunway TaihuLight supercomputer. We propose two parallel implementations: in the first, one processor processes 4 images simultaneously; in the second, 256 images are processed at the same time. Extensive serial and parallel experiments show that the first implementation is suited to high-resolution images, with a speedup of up to 83, and the second is suited to low-resolution images, with a maximum speedup of 95. Results on multiple processor nodes show that both implementations scale well.

Key words: histogram of oriented gradient feature extraction, Sunway TaihuLight, Sunway SW26010, parallel implementation
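
For readers unfamiliar with the descriptor named in the abstract, the core of HOG is a per-cell histogram of gradient orientations weighted by gradient magnitude. The following is a minimal serial sketch in Python, not the authors' Sunway implementation. The function name `cell_histogram`, the choice of 9 unsigned-orientation bins, the centered-difference gradient with replicated borders, and the hard (non-interpolated) bin assignment are all assumptions made for illustration.

```python
import math


def cell_histogram(cell, bins=9):
    """Gradient-orientation histogram of one grayscale cell (illustrative only).

    `cell` is a list of equal-length rows of pixel intensities. Orientations
    are unsigned (0 to 180 degrees) and each pixel votes with its gradient
    magnitude into a single bin, without interpolation between bins.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(height):
        for x in range(width):
            # Centered differences; border pixels reuse their own value.
            gx = cell[y][min(x + 1, width - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, height - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against a rounded angle of exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# A horizontal intensity ramp has purely horizontal gradients (angle 0).
ramp = [[0, 1, 2, 3] for _ in range(4)]
print(cell_histogram(ramp))
```

Because each image's histograms depend only on that image's pixels, separate images can be processed independently, which is the property that makes image-level schemes such as the two described in the abstract possible.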