• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 582-591.

• High Performance Computing •

Design and FPGA implementation of lightweight convolutional neural network hardware acceleration

LI Zhenqi, WANG Qiang, QI Xingyun, LAI Mingche, ZHAO Yankang, LU Yihang, LI Yuan

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2023-08-18  Revised: 2024-07-09  Online: 2025-04-25  Published: 2025-04-17

Abstract: In recent years, convolutional neural networks (CNNs) have achieved remarkable results in fields such as computer vision. However, CNNs typically have complex network structures and substantial computational requirements, making them difficult to deploy on portable devices with limited computational resources and power budgets. FPGAs, with their high parallelism, energy efficiency, and reconfigurability, have become one of the most effective computing platforms for accelerating CNN inference on portable devices. This paper proposes a CNN accelerator that can be configured for different network structures, and optimizes its latency and power consumption in three aspects: data reuse, pipeline optimization based on line buffers, and low-latency convolution based on adder trees. Taking the lightweight YOLOv2-tiny network model as an example, a real-time object detection system was built on the Navigator ZYNQ-7020 development board. Experimental results show that the design meets the low hardware-cost and low-power requirements of portable devices, consuming 88% of on-chip resources and 2.959 W of power, and achieving a detection speed of 3.91 fps at an image resolution of 416×256.
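The low-latency convolution via adder trees mentioned in the abstract can be illustrated with a small software model. This sketch is illustrative only (it is not the paper's hardware implementation, and all names are hypothetical): summing the nine products of a 3×3 window by balanced pairwise reduction takes ⌈log₂ 9⌉ = 4 adder stages in hardware, versus 8 sequential additions for a naive accumulator chain.

```python
# Illustrative software model of an adder-tree reduction, as used for
# low-latency convolution accumulation in FPGA designs. Function names
# are hypothetical, not from the paper.

def adder_tree_sum(values):
    """Sum a list by balanced pairwise reduction.

    Each pass models one hardware adder stage; a 9-input tree finishes
    in 4 stages instead of 8 chained additions.
    """
    vals = list(values)
    while len(vals) > 1:
        paired = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:          # odd element carries over to next stage
            paired.append(vals[-1])
        vals = paired
    return vals[0]

def conv3x3(window, kernel):
    """One output pixel of a 3x3 convolution: nine parallel multiplies,
    then a tree-reduced accumulation of the products."""
    products = [w * k for w, k in zip(window, kernel)]
    return adder_tree_sum(products)

# Example: a 3x3 window (flattened row-major) with a vertical-edge kernel.
window = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1, 1, 0, -1, 1, 0, -1]
print(conv3x3(window, kernel))  # prints -6, same as the plain dot product
```

In hardware, each `while` pass corresponds to a pipeline stage of adders operating in parallel, so output latency grows logarithmically rather than linearly with the number of multiply results to accumulate.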

Key words: convolutional neural network (CNN), FPGA acceleration, accelerator, portable device