Design and FPGA implementation of YOLOv3-tiny hardware acceleration

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (12): 2139-2149.


CHEN Hao-min1, YAO Sen-jing1, XI Yu1, ZHANG Fan1, XIN Wen-cheng1, WANG Long-hai2, REN Chao2

(1. China Southern Power Grid Digital Grid Research Institute Limited Company, Guangzhou 510623;

2. School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China)

Received: 2020-11-10 Revised: 2021-01-04 Online: 2021-12-25 Published: 2021-12-31

Abstract

YOLOv3-tiny offers excellent object detection capability, but the model still demands considerable computing power, which makes it difficult to deploy in embedded applications. This paper proposes a hardware acceleration method for YOLOv3-tiny and implements it on an FPGA platform. First, for the fixed-point design of the network, data accuracy and resource consumption are taken as the design criteria, and different fixed-point strategies are determined by collecting statistics on the data distribution in the model and classifying the data by type. Second, for the parallel design of the network, the computational characteristics of the convolutional neural network are analyzed, and a scalable general-purpose hardware computing unit is designed using loop reordering, loop tiling, loop unrolling, and array partitioning. Third, the network pipeline is studied at two levels, inter-layer and intra-layer: based on the direction of the inter-layer data flow and the division of tasks within each layer, a flexible pipelined computing architecture is designed. Finally, experiments on the Xilinx XC7Z020CLG400-1 platform show that, compared with a single-core ARM Cortex-A9 processor running at 667 MHz, the proposed design achieves a speedup of up to 290.56.
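The fixed-point and loop-tiling ideas summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 16-bit word length, the tile sizes `tm` and `tc`, and all function names are assumptions chosen for illustration, since the abstract does not state the actual bit widths or tiling factors. The first part picks a fractional bit count from the observed data range; the second part shows that a convolution tiled over output and input channels gives the same result as the direct loop nest.

```python
import math

def choose_frac_bits(values, total_bits=16):
    """Pick the largest fractional bit count such that the largest
    magnitude in `values` still fits in a signed `total_bits` word.
    (total_bits=16 is an assumption, not a figure from the paper.)"""
    max_abs = max(abs(v) for v in values)
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1) if max_abs > 0 else 0
    return total_bits - 1 - int_bits  # one bit is reserved for the sign

def to_fixed(x, frac_bits, total_bits=16):
    """Quantize a float to a saturating signed fixed-point integer."""
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * (1 << frac_bits))))

def to_float(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)

def _out_shape(inp, w):
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    M, K = len(w), len(w[0][0])
    return C, M, K, H - K + 1, W - K + 1

def conv_direct(inp, w):
    """Reference convolution: inp[c][y][x], w[m][c][ky][kx], stride 1, no padding."""
    C, M, K, OH, OW = _out_shape(inp, w)
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):
        for c in range(C):
            for y in range(OH):
                for x in range(OW):
                    for ky in range(K):
                        for kx in range(K):
                            out[m][y][x] += w[m][c][ky][kx] * inp[c][y + ky][x + kx]
    return out

def conv_tiled(inp, w, tm=2, tc=2):
    """Same convolution with the channel loops tiled and moved innermost.
    In hardware, the two innermost loops would typically be fully unrolled
    into tm*tc parallel multiply-accumulate units."""
    C, M, K, OH, OW = _out_shape(inp, w)
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m0 in range(0, M, tm):            # tile over output channels
        for c0 in range(0, C, tc):        # tile over input channels
            for y in range(OH):
                for x in range(OW):
                    for ky in range(K):
                        for kx in range(K):
                            for m in range(m0, min(m0 + tm, M)):
                                for c in range(c0, min(c0 + tc, C)):
                                    out[m][y][x] += w[m][c][ky][kx] * inp[c][y + ky][x + kx]
    return out
```

Because the tiled version only reorders integer multiply-accumulate operations, its output matches the direct version exactly; with fixed-point operands the same property holds as long as the accumulator is wide enough not to overflow.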

Key words: YOLOv3-tiny, convolutional neural network, field programmable gate array, hardware acceleration

CHEN Hao-min, YAO Sen-jing, XI Yu, ZHANG Fan, XIN Wen-cheng, WANG Long-hai, REN Chao. Design and FPGA implementation of YOLOv3-tiny hardware acceleration[J]. Computer Engineering & Science, 2021, 43(12): 2139-2149.

