Design and FPGA implementation of YOLOv3-tiny hardware acceleration

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (12): 2139-2149.

CHEN Hao-min¹, YAO Sen-jing¹, XI Yu¹, ZHANG Fan¹, XIN Wen-cheng¹, WANG Long-hai², REN Chao²

(1. China Southern Power Grid Digital Grid Research Institute Limited Company, Guangzhou 510623;
2. School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China)

Received: 2020-11-10 Revised: 2021-01-04 Accepted: 2021-12-25 Online: 2021-12-25 Published: 2021-12-31

Abstract

YOLOv3-tiny offers excellent object detection capability, but the model still demands substantial computation, which makes it difficult to deploy in embedded applications. This paper proposes a hardware acceleration method for YOLOv3-tiny and implements it on an FPGA platform. First, for the fixed-point design of the network, data accuracy and resource consumption serve as the design metrics; the data in the model are divided into types, and a fixed-point strategy is determined for each type from statistics of its distribution. Second, for the parallel design of the network, the computational characteristics of convolutional neural networks are analyzed, and a scalable, general-purpose hardware computing unit is designed using loop reordering, loop tiling, loop unrolling, and array partitioning. Third, the pipeline design of the network is studied at both the inter-layer and intra-layer levels: based on the direction of inter-layer data flow and the division of tasks within a layer, a flexible pipelined computing architecture is designed. Finally, experiments on the Xilinx XC7Z020CLG400-1 platform show that, compared with a single-core ARM-A9 processor at 667 MHz, the proposed design achieves a speedup of up to 290.56×.
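
The idea of choosing a fixed-point format from the statistics of each data type can be illustrated with a minimal sketch. It is not the authors' implementation: the 16-bit signed word, the helper names (choose_frac_bits, quantize, dequantize), and the sample values are all assumptions made for illustration. The sketch picks the number of fractional bits from the largest magnitude observed in a group of values, so that weights and activations can end up with different formats.

```python
import math

TOTAL_BITS = 16  # assumed word width; the abstract does not state one


def choose_frac_bits(values, total_bits=TOTAL_BITS):
    """Pick fractional bits so the largest observed magnitude fits a signed word."""
    max_abs = max(abs(v) for v in values)
    if max_abs > 0:
        # Integer bits needed for the magnitude, excluding the sign bit.
        int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    else:
        int_bits = 0
    return total_bits - 1 - int_bits


def quantize(values, frac_bits, total_bits=TOTAL_BITS):
    """Scale to integers, round to nearest, and saturate to the signed range."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [min(hi, max(lo, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    return [c / (1 << frac_bits) for c in codes]


# Hypothetical sample data standing in for two data types in the model.
weights = [0.75, -0.5, 0.1]
activations = [5.3, -2.0, 0.25]

w_frac = choose_frac_bits(weights)      # small range, more fractional bits
a_frac = choose_frac_bits(activations)  # larger range, fewer fractional bits

w_codes = quantize(weights, w_frac)
a_codes = quantize(activations, a_frac)
```

In this sketch the trade-off named in the abstract shows up directly: a wider word lowers the rounding error but costs more FPGA resources, while a per-type choice of fractional bits keeps the error within half a least-significant bit for values inside the observed range.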

Keywords: YOLOv3-tiny, convolutional neural network, field-programmable gate array, hardware acceleration

CHEN Hao-min, YAO Sen-jing, XI Yu, ZHANG Fan, XIN Wen-cheng, WANG Long-hai, REN Chao. Design and FPGA implementation of YOLOv3-tiny hardware acceleration[J]. Computer Engineering & Science, 2021, 43(12): 2139-2149.
