
Computer Engineering & Science, 2024, Vol. 46, Issue (11): 1949-1959.

• High Performance Computing •

A heterogeneous differential synchronous parallel training algorithm

HUANG Shan1,2,3, WU Yu-fan1,2,3, LÜ He-xuan1,2,3, DUAN Xiao-dong1,2,3

  (1. College of Computer Science and Technology, Dalian Minzu University, Dalian 116650, China;
   2. State Ethnic Affairs Commission Key Laboratory of Big Data Applied Technology, Dalian 116650, China;
   3. Dalian Key Laboratory of Digital Technology for National Culture, Dalian 116650, China)
  • Received: 2023-12-19  Revised: 2024-01-18  Accepted: 2024-11-25  Online: 2024-11-25  Published: 2024-11-27

Abstract: The back propagation neural network (BPNN) is widely used in fields such as behavior recognition and prediction owing to its strong nonlinearity, self-learning capability, adaptability, and robust fault tolerance. As models are upgraded and optimized and data volumes grow rapidly, parallel training architectures built on big data distributed computing frameworks have become mainstream. Apache Flink, a new-generation big data computing framework, is widely adopted for its high throughput and low latency. However, because hardware is upgraded frequently and purchased in different batches, Flink clusters in real-world deployments are mostly heterogeneous, meaning that computing resources within a cluster are unbalanced. Existing BPNN parallel training models cannot prevent high-performance nodes from idling during training under such an unbalanced resource distribution. Moreover, in a heterogeneous environment, the inter-node communication overhead of BPNN parallel training grows as the number of nodes increases. The traditional mini-batch gradient descent method offers precise optimization, but the combination of random model initialization and this precise mini-batch descent leads to slow convergence in BPNN parallel training. To address these issues, this paper proposes the heterogeneous micro-difference synchronous parallel training (HMDSPT) algorithm, which aims to accelerate BPNN parallel training and improve its efficiency in a heterogeneous environment. The algorithm scores node performance according to performance variations in the heterogeneous environment and, through a data partitioning module, dynamically allocates data in real time in proportion to those scores, ensuring that the amount of data assigned to each node is directly proportional to its performance. This approach reduces the idle time of high-performance nodes.
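To make the performance-proportional allocation concrete, the following self-contained Java sketch shows one way to split a batch of records across nodes in proportion to their performance scores. It is not the paper's implementation: the class name, the example scores, and the record-to-node mapping are illustrative assumptions, and a real HMDSPT deployment would hook such logic into Flink's partitioning machinery.

import java.util.Arrays;

/**
 * Minimal sketch (not the authors' code) of performance-proportional data
 * partitioning: a node with score s_i receives a share of the batch equal to
 * s_i / sum(s), so faster nodes get more records and idle less while
 * waiting for slower nodes.
 */
public class ProportionalPartitioner {

    private final double[] cumulativeShare; // prefix sums of normalized scores

    public ProportionalPartitioner(double[] performanceScores) {
        double total = Arrays.stream(performanceScores).sum();
        cumulativeShare = new double[performanceScores.length];
        double running = 0.0;
        for (int i = 0; i < performanceScores.length; i++) {
            running += performanceScores[i] / total;
            cumulativeShare[i] = running;
        }
    }

    /** Maps a record's position within a batch to a node index. */
    public int nodeFor(long recordIndex, long batchSize) {
        double position = (recordIndex + 0.5) / batchSize; // in (0, 1)
        for (int node = 0; node < cumulativeShare.length; node++) {
            if (position <= cumulativeShare[node]) {
                return node;
            }
        }
        return cumulativeShare.length - 1; // guard against rounding error
    }

    public static void main(String[] args) {
        // Hypothetical scores: node 2 is twice as fast as node 0.
        ProportionalPartitioner p =
                new ProportionalPartitioner(new double[]{1.0, 1.5, 2.0});
        long batch = 9_000;
        long[] counts = new long[3];
        for (long i = 0; i < batch; i++) {
            counts[p.nodeFor(i, batch)]++;
        }
        System.out.println(Arrays.toString(counts)); // [2000, 3000, 4000]
    }
}

Under these assumed scores, the 9 000-record batch splits 2 000 / 3 000 / 4 000, matching each node's share of the total performance score, which is the proportionality property the abstract describes.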

Key words: Flink, back propagation neural network (BPNN), parallel training, heterogeneous environment