
Computer Engineering & Science


Parallel optimization for deep learning based on HPC environment

CHEN Mengqiang,YAN Zijie,YE Yan,WU Weigang   

1. (School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China)
  • Received:2018-07-20 Revised:2018-09-27 Online:2018-11-26 Published:2018-11-25

Abstract:

Deep learning technology has been widely applied for various purposes, especially big data analysis. However, the computational demands of deep learning keep growing in both scale and complexity. To accelerate the training of large-scale deep networks, various distributed parallel training protocols have been proposed. We design a novel asynchronous training protocol, called the weighted asynchronous parallel protocol (WASP), to update neural network parameters more effectively. The core of WASP is how it handles “gradient staleness”: gradients are weighted by a metric based on parameter version numbers, which reduces the influence of stale gradients on the parameters. Moreover, by periodically forcing synchronization of the model parameters, WASP combines the advantages of synchronous and asynchronous training and speeds up training while maintaining a fast and stable convergence rate. We conduct experiments on the Tianhe-2 supercomputing system with two classical convolutional neural networks, LeNet-5 and ResNet-101, and the results show that WASP achieves a much higher speedup and more stable convergence than existing asynchronous parallel training protocols.
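To make the parameter-server update rule concrete, the sketch below illustrates staleness-weighted asynchronous updates with periodic forced synchronization. It is a minimal illustration only: the class name, the 1/(1 + staleness) weighting function, the sync_period value, and the learning rate are assumptions for the example, not the exact definitions used by WASP in the paper.

```python
# Illustrative sketch of a staleness-weighted parameter-server update.
# The weighting function and synchronization period are assumptions;
# the abstract does not specify WASP's exact formulas.
import numpy as np

class WeightedAsyncParameterServer:
    def __init__(self, init_params, sync_period=50, lr=0.01):
        self.params = np.asarray(init_params, dtype=np.float64)
        self.version = 0              # global parameter version number
        self.sync_period = sync_period
        self.lr = lr

    def pull(self):
        """Worker fetches the current parameters and their version number."""
        return self.params.copy(), self.version

    def push(self, gradient, worker_version):
        """Worker pushes a gradient computed against parameters of `worker_version`."""
        staleness = self.version - worker_version   # version-number-based staleness
        weight = 1.0 / (1.0 + staleness)            # assumed decay: staler gradients count less
        self.params -= self.lr * weight * np.asarray(gradient, dtype=np.float64)
        self.version += 1
        # Periodic forced synchronization: every `sync_period` updates, all workers
        # would be required to pull the latest parameters before pushing again
        # (the actual barrier among workers is omitted in this sketch).
        return self.version % self.sync_period == 0  # True => workers must resynchronize

# Usage: two workers pushing gradients computed from different parameter versions.
server = WeightedAsyncParameterServer(init_params=np.zeros(4))
params_a, ver_a = server.pull()
params_b, ver_b = server.pull()
server.push(np.ones(4), ver_a)   # fresh gradient, applied with full weight
server.push(np.ones(4), ver_b)   # one version stale, applied with reduced weight
```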
 

Key words: deep learning, distributed parallelization, Tianhe-2, parameter server, staleness