• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (12): 2139-2149.

Previous Articles     Next Articles

Design and FPGA implementation of YOLOv3-tiny hardware acceleration

CHEN Hao-min1,YAO Sen-jing1,XI Yu1,ZHANG Fan1,XIN Wen-cheng1,WANG Long-hai2,REN Chao2   

  1. (1.China Southern Power Grid Digital Grid Research Institute Limited Company,Guangzhou 510623;

    2.School of Electrical Automation and Information Engineering,Tianjin University,Tianjin 300072,China)
  • Received:2020-11-10 Revised:2021-01-04 Accepted:2021-12-25 Online:2021-12-25 Published:2021-12-31

Abstract: YOLOv3-tiny has the excellent target detection capability, but the computational power required by the model is still large, so it is difficult to be used in the embedded application field. This paper proposes a hardware acceleration method of YOLOv3-tiny and implements it on FPGA platform. Firstly, for the fixed-point design of the network, with data accuracy and resource consumption as design indicators, through the statistics of the data distribution in the model and the division of data types, different fixed-point strategies are determined. Secondly, for the parallel design of the network, through the analysis of the calculation characteristics of the convolutional neural network, with the methods of loop adjustment, loop block, loop expansion, and array splitting, a scalable common hardware comput- ing unit is designed. Then, for the network pipeline design, the research is carried out from two aspects: the inter-layer and the intra-layer. Based on the direction of the inter-layer data flow and the division of tasks within the layer, a flexible pipeline computing architecture is designed. Lastly, on the XILINX XC7Z020CLG400-1 platform, experiments demonstrate that, compared with single-core ARM-A9 processor at 667MHz, the proposal achieves the calculation speed as high as 290.56. 


Key words: YOLOv3-tiny, convolutional neural network, field programmable gate array, hardware acceleration ,