Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 582-591.
• High Performance Computing •
Design and FPGA implementation of lightweight convolutional neural network hardware acceleration
LI Zhenqi,WANG Qiang,QI Xingyun,LAI Mingche,ZHAO Yankang,LU Yihang,LI Yuan
Abstract: In recent years, convolutional neural networks (CNNs) have achieved remarkable results in fields such as computer vision. However, CNNs typically have complex network structures and heavy computational requirements, making them difficult to deploy on portable devices with limited computing resources and power budgets. FPGAs, with their high parallelism, energy efficiency, and reconfigurability, have become one of the most effective computing platforms for accelerating CNN inference on portable devices. This paper proposes a CNN accelerator that can be configured for different network structures and optimizes its latency and power consumption in three ways: data reuse, row-buffer-based pipelining, and low-latency convolution based on adder trees. Taking the lightweight YOLOv2-tiny network model as an example, a real-time object detection system was built on the Navigator ZYNQ-7020 development board. Experimental results show that the design meets the low hardware-resource and power budgets of portable devices, consuming 88% of the on-chip resources and 2.959 W of power, and achieving a detection speed of 3.91 fps at an image resolution of 416×256.
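The row-buffer pipelining and adder-tree convolution mentioned in the abstract can be illustrated with a small software model. The following C sketch is a hypothetical illustration, not the authors' hardware design: it assumes a 3×3 kernel and a 416-pixel-wide input row, keeps the two previous rows in line buffers so that each newly arriving pixel completes a full sliding window, and reduces the nine products with a balanced adder tree (pairwise sums in logarithmic depth) of the kind used to shorten a convolution's critical path.

```c
/* Hypothetical sketch (not the paper's RTL): a 3x3 convolution window fed by
 * two row (line) buffers, with the nine products reduced by a balanced adder
 * tree so accumulation depth grows as ~log2(9) rather than 9 serial adds. */
#include <stdio.h>

#define W 416            /* image width, matching the 416x256 input in the paper */
#define K 3              /* kernel size assumed for illustration */

static int line_buf[K - 1][W];   /* two row buffers holding the previous rows */
static int window[K][K];         /* 3x3 sliding-window registers */

/* Shift in one new pixel: fill the window's new column from the line buffers,
 * then push the pixel through the buffers (models the streaming row-buffer pipeline). */
static void shift_in(int col, int pixel)
{
    for (int r = 0; r < K; r++)
        for (int c = 0; c < K - 1; c++)
            window[r][c] = window[r][c + 1];
    window[0][K - 1] = line_buf[0][col];
    window[1][K - 1] = line_buf[1][col];
    window[2][K - 1] = pixel;
    line_buf[0][col] = line_buf[1][col];
    line_buf[1][col] = pixel;
}

/* Adder-tree reduction of the nine products: pairwise sums in log-depth
 * stages (9 -> 5 -> 3 -> 2 -> 1), which keeps the multiply-accumulate
 * path short enough to fit in one pipeline stage. */
static int conv3x3(const int kernel[K][K])
{
    int p[K * K];
    int n = 0;
    for (int r = 0; r < K; r++)
        for (int c = 0; c < K; c++)
            p[n++] = window[r][c] * kernel[r][c];
    while (n > 1) {
        int m = 0;
        for (int i = 0; i + 1 < n; i += 2)
            p[m++] = p[i] + p[i + 1];
        if (n & 1)
            p[m++] = p[n - 1];
        n = m;
    }
    return p[0];
}

int main(void)
{
    const int identity[K][K] = { {0, 0, 0}, {0, 1, 0}, {0, 0, 0} };
    /* feed three rows of a ramp image and print one output as a smoke test */
    for (int row = 0; row < 3; row++)
        for (int col = 0; col < W; col++)
            shift_in(col, row * W + col);
    printf("%d\n", conv3x3(identity));
    return 0;
}
```

In hardware, this structure can accept one pixel and produce one window per cycle once the line buffers are primed, which is the intent behind the row-buffer pipelining described in the abstract.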
Key words: convolutional neural network (CNN), FPGA acceleration, accelerator, portable device
LI Zhenqi, WANG Qiang, QI Xingyun, LAI Mingche, ZHAO Yankang, LU Yihang, LI Yuan. Design and FPGA implementation of lightweight convolutional neural network hardware acceleration[J]. Computer Engineering & Science, 2025, 47(4): 582-591.
URL: http://joces.nudt.edu.cn/EN/Y2025/V47/I4/582