• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


An FPGA-based HEVC post-processing CNN hardware accelerator
 

XIA Jun1, QIAN Lei2, YAN Wei3, CHAI Zhilei1

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
  2. State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214122, China
  3. School of Software & Microelectronics, Peking University, Beijing 102600, China
     
  • Received: 2018-06-29 Revised: 2018-08-18 Online: 2018-12-25 Published: 2018-12-25

Abstract:

To address the shortcomings of running the post-processing CNN algorithm for the High Efficiency Video Coding (HEVC) standard on general-purpose platforms, we propose a hardware parallel architecture for the post-processing convolutional neural network based on a field programmable gate array (FPGA). By optimizing the concurrent buffering of input and output data, the architecture improves both the overall parallelism of the convolution module and the hardware pipelining of the modules. Experiments on 176×144 video streams on the Xilinx ZCU102 show that the proposed CNN hardware accelerator achieves an equivalent computational performance of 360.5 GFLOPS (giga floating-point operations per second). It processes 81.01 frames per second (FPS), which is 76.67 times the speed of the Intel i7-4790K running at 4 GHz and 32.50 times that of the NVIDIA GeForce GTX 750Ti. The accelerator consumes 12.095 W, and its energy efficiency ratio is 512.9 times that of the Intel i7-4790K and 125.78 times that of the NVIDIA GeForce GTX 750Ti.
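
The figures in the abstract can be cross-checked with a little arithmetic. The sketch below is not from the paper: the function names are hypothetical, and it assumes that "energy efficiency ratio" means throughput per watt, so that the efficiency gain equals the speedup multiplied by the ratio of baseline power to FPGA power. Under that assumption it derives the frame rates implied for the CPU and GPU baselines, the accelerator's GFLOPS per watt, and the baseline power draws implied by the reported ratios.

```python
# Figures quoted in the abstract.
FPGA_GFLOPS = 360.5    # equivalent throughput in GFLOPS
FPGA_FPS = 81.01       # frames per second on 176x144 video
FPGA_POWER_W = 12.095  # power consumption in watts

# Reported speedups and energy efficiency gains relative to each baseline.
SPEEDUP = {"Intel i7-4790K": 76.67, "NVIDIA GeForce GTX 750Ti": 32.50}
EFFICIENCY_GAIN = {"Intel i7-4790K": 512.9, "NVIDIA GeForce GTX 750Ti": 125.78}


def baseline_fps(platform: str) -> float:
    """Frame rate implied for a baseline: FPGA frame rate divided by the speedup."""
    return FPGA_FPS / SPEEDUP[platform]


def fpga_gflops_per_watt() -> float:
    """Throughput per watt of the accelerator, from the two quoted figures."""
    return FPGA_GFLOPS / FPGA_POWER_W


def implied_baseline_power(platform: str) -> float:
    """Baseline power in watts implied by the reported ratios.

    Assumption (not stated in the abstract): efficiency = throughput / power,
    hence efficiency_gain = speedup * (P_baseline / P_fpga).
    """
    return EFFICIENCY_GAIN[platform] / SPEEDUP[platform] * FPGA_POWER_W


if __name__ == "__main__":
    print(f"FPGA efficiency: {fpga_gflops_per_watt():.2f} GFLOPS/W")
    for name in SPEEDUP:
        print(f"{name}: {baseline_fps(name):.2f} FPS, "
              f"implied power {implied_baseline_power(name):.1f} W")
```

Under the stated assumption, the accelerator delivers roughly 30 GFLOPS per watt, and the reported ratios imply baseline power draws of roughly 81 W for the CPU and 47 W for the GPU. These are derived values, not measurements reported in the abstract.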

Key words: HEVC post-processing, convolutional neural network, field programmable gate array (FPGA), hardware implementation