
Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (11): 2043-2048.

• Artificial Intelligence and Data Mining •


A deep neural network model compression method based on Adams shortcut connection

DU Peng, LI Chao, SHI Jian-ping, JIANG Lin

  1. (Faculty of Science, Kunming University of Science and Technology, Kunming 650500, China)
  • Received: 2020-04-03 Revised: 2020-09-07 Accepted: 2021-11-25 Online: 2021-11-25 Published: 2021-11-23
  • Supported by: National Natural Science Foundation of China (11561034); Yunnan Provincial Department of Education Fund (KKJB201707008)


Abstract: Deep neural networks have achieved great success in a wide range of computer vision tasks, yet network structure design still lacks guiding principles. A great deal of theoretical and empirical evidence shows that the depth of neural networks is key to their success, but the trainability of deep neural networks remains an open problem. In this paper, the Adams method, a numerical method for solving differential equations, is applied to the weight learning of deep neural networks, and a shortcut connection based on the Adams method is proposed. This connection improves the learning accuracy of the network in its later stages, compresses the model size, and makes the model more efficient; the improvement in trainability is especially pronounced for deep neural networks with relatively few layers. Taking the classic ResNet as an example, this paper compares Adams-ResNet, which uses the Adams-based shortcut connection, with the source model on Cifar10. The proposed method improves recognition accuracy while reducing the number of parameters of the source model by half.
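As a minimal, purely illustrative sketch (not the authors' implementation), the following PyTorch module shows one way a two-step Adams-Bashforth update, y_{n+1} = y_n + (3/2)F(y_n) - (1/2)F(y_{n-1}), could be realized as a shortcut connection that reuses the residual of the preceding block. The class name AdamsBlock, the layer sizes, and the prev_residual argument are hypothetical choices made only for this example.

```python
import torch
import torch.nn as nn


class AdamsBlock(nn.Module):
    """Residual block with an Adams-Bashforth-style shortcut (illustrative only)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def residual(self, x):
        # F(y): the ordinary two-convolution residual branch of a ResNet block.
        return self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))

    def forward(self, x, prev_residual=None):
        f_x = self.residual(x)
        if prev_residual is None:
            # No history yet (first block): fall back to the plain ResNet shortcut.
            out = x + f_x
        else:
            # Two-step Adams-Bashforth combination of current and previous residuals.
            out = x + 1.5 * f_x - 0.5 * prev_residual
        # Return the new state and this block's residual so the next block can reuse it.
        return self.relu(out), f_x


if __name__ == "__main__":
    blocks = nn.ModuleList([AdamsBlock(16) for _ in range(3)])
    y, prev = torch.randn(2, 16, 32, 32), None
    for block in blocks:
        y, prev = block(y, prev)
    print(y.shape)  # torch.Size([2, 16, 32, 32])
```

Because each block reuses a residual already computed by its predecessor, a higher-order update of this kind may reach comparable accuracy with fewer blocks, which is one plausible reading of the parameter reduction reported in the abstract.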

Key words: deep neural network, numerical method for differential equations, Adams method, shortcut connection