• Publication of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• High Performance Computing

Towards convolutional neural network acceleration
and compression via the K-means algorithm

CHEN Guilin, MA Sheng, GUO Yang, LI Yihuang, XU Rui   

  1. (School of Computer, National University of Defense Technology, Changsha 410073, Hunan, China)
  • Received: 2018-08-05  Revised: 2018-10-16  Online: 2019-05-25  Published: 2019-05-25
  • Supported by:

    The National Natural Science Foundation of China (61672526); the Scientific Research Program of the National University of Defense Technology (ZK170306)


Abstract:

In recent years, machine learning, with neural networks as a typical representative, has developed rapidly and is now widely deployed in industrial fields such as speech recognition and image recognition. As application environments become more complex and accuracy requirements rise, network scale keeps growing. Large-scale neural networks are both computation-intensive and storage-intensive: the convolutional layers are computation-intensive, while the fully connected layers are storage-intensive. In the former, processing speed cannot keep up with memory access speed; in the latter, memory access speed cannot keep up with processing speed. Based on the confidence interval of the prediction accuracy obtained when training the network, we propose a method that uses the K-means algorithm to accelerate and compress neural networks. The amount of computation is reduced by compressing the input feature maps of the convolution with K-means, and the amount of storage is reduced by compressing the weights of the fully connected layers. The proposed method reduces the computation of a single convolutional layer of AlexNet by up to two orders of magnitude. With appropriate K-means layers inserted, the speedup of the whole network's processing time reaches 2.077, and the network compression ratio reaches 8.7%.
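To make the weight-compression half of the method concrete, the sketch below clusters a fully connected layer's weights with 1-D K-means and stores one 8-bit code per weight plus a small codebook of centroids. This is a minimal NumPy illustration of the general technique, not the paper's implementation; the layer shape, k = 256, and the iteration count are illustrative assumptions.

```python
import numpy as np

def kmeans_1d(values, k, iters=20, seed=0):
    """Plain 1-D K-means (Lloyd's algorithm) over a flat array of values."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        idx = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned values.
        for j in range(k):
            members = values[idx == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids, idx

def compress_fc_weights(W, k=256):
    """Replace a fully connected weight matrix with a codebook of k centroids
    plus one small index per weight (8 bits when k <= 256)."""
    centroids, idx = kmeans_1d(W.ravel().astype(np.float64), k)
    codes = idx.astype(np.uint8).reshape(W.shape)
    return centroids, codes

def decompress(centroids, codes):
    # Look up each 8-bit code in the codebook to rebuild an approximate W.
    return centroids[codes]

# Toy usage: a 64x64 fully connected layer stored as 32-bit floats.
W = np.random.default_rng(1).standard_normal((64, 64)).astype(np.float32)
centroids, codes = compress_fc_weights(W, k=256)
W_hat = decompress(centroids, codes).astype(np.float32)

original_bits = W.size * 32
compressed_bits = codes.size * 8 + centroids.size * 32
print(compressed_bits / original_bits)  # prints 0.3125 for this layer with k=256
```

The ratio here is roughly 8/32 = 0.25 plus the codebook overhead; for the large FC layers of a real network the codebook becomes negligible, and a smaller k compresses further at the cost of reconstruction error.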
 
 

Key words: neural network, confidence interval, acceleration, cluster compression
