
Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (07): 1141-1148.

• High Performance Computing •

Running optimization of deep learning accelerators under different pruning strategies

YI Xiao, MA Sheng, XIAO Nong

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2021-12-08 Revised: 2022-02-25 Accepted: 2023-07-25 Online: 2023-07-25 Published: 2023-07-11

Abstract: Convolutional neural networks have achieved great success in image analysis. As deep learning develops, models are becoming increasingly complex and their computational cost is growing rapidly. Sparsification can effectively reduce this cost without sacrificing accuracy. This paper applies three pruning strategies to the ResNet18 and GoogLeNet models to reduce their computational cost. The results show that, without reducing accuracy, global unstructured pruning achieves sparsities of 94% and 90% on the two models respectively; with essentially no loss of accuracy, layer-level unstructured pruning achieves average sparsities of 83% and 56%, and layer-level structured pruning achieves average sparsities of 34% and 22%. Running the sparse models on the Eyeriss deep learning accelerator and comparing latency and power consumption against the unpruned baselines shows that, on ResNet18, global unstructured pruning reduces latency by 66.0% and power consumption by 60.7%, layer-level unstructured pruning reduces latency by 66.0% and power consumption by 80.6%, and layer-level structured pruning reduces latency by 65.6% and power consumption by 33.5%. On GoogLeNet, global unstructured pruning reduces latency by 74.5% and power consumption by 63.2%, layer-level unstructured pruning reduces latency by 73.6% and power consumption by 55.0%, and layer-level structured pruning reduces latency by 26.8% and power consumption by 5.8%. This paper therefore concludes that global unstructured pruning can greatly reduce latency and energy consumption without reducing accuracy, and that layer-level unstructured pruning can greatly reduce latency and energy consumption at the cost of a slight loss of accuracy.
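To make the three strategies concrete, the sketch below shows how they could be expressed with PyTorch's torch.nn.utils.prune utilities. This is an illustrative assumption, not the authors' implementation: the sparsity amounts are placeholders rather than the tuned values reported above, and only one strategy would be applied to a given model at a time (the other two are shown commented out to contrast their granularity).

    import torch
    import torch.nn.utils.prune as prune
    from torchvision.models import resnet18

    model = resnet18(weights=None)
    conv_layers = [(m, "weight") for m in model.modules()
                   if isinstance(m, torch.nn.Conv2d)]

    # 1) Global unstructured pruning: a single L1 threshold over all conv weights.
    prune.global_unstructured(conv_layers,
                              pruning_method=prune.L1Unstructured,
                              amount=0.9)  # placeholder sparsity, not the paper's value

    # 2) Layer-level unstructured pruning: each layer is pruned independently.
    # for module, name in conv_layers:
    #     prune.l1_unstructured(module, name, amount=0.5)

    # 3) Layer-level structured pruning: whole output channels are removed per layer.
    # for module, name in conv_layers:
    #     prune.ln_structured(module, name, amount=0.3, n=2, dim=0)

    # Bake the masks into the weights before mapping the sparse model onto the accelerator.
    for module, name in conv_layers:
        prune.remove(module, name)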

Key words: deep learning accelerator, convolutional neural network, pruning