Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (04): 580-589.
• High Performance Computing •
CHEN Jie,LI Cheng,LIU Zhong
Abstract: With the widespread application of deep learning, represented by convolutional neural networks (CNNs), the computational requirements of neural network models have grown rapidly, driving the development of deep learning accelerators. Research focus has accordingly shifted to accelerating and optimizing neural network models based on the architectural characteristics of these accelerators. For the inference and training algorithms of the VGG network model on the independently designed multicore vector accelerator FT-M7004, vectorized mapping methods are proposed for core operators such as convolution, pooling, and fully connected layers. Optimization strategies, including SIMD vectorization, DMA double-buffered transfers, and weight sharing, are employed to fully exploit the architectural advantages of the vector accelerator and achieve high computational efficiency. Experimental results show that, on the FT-M7004 platform, the average computational efficiency of convolutional layer inference and training reaches 86.62% and 69.63%, respectively, while that of fully connected layer inference and training reaches 93.17% and 81.98%, respectively. The inference efficiency of the VGG network model on FT-M7004 exceeds that on the GPU platform by more than 20%.
Key words: multicore vector accelerator, convolutional neural network, inference algorithm, training algorithm
CHEN Jie, LI Cheng, LIU Zhong. Convolutional neural network inference and training vectorization method for multicore vector accelerators[J]. Computer Engineering & Science, 2024, 46(04): 580-589.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2024/V46/I04/580