• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (08): 1331-1341.

• High Performance Computing •

A cluster job execution time prediction model based on LSTM

ZHU Zheng-dong (1), WU Yin-chao (2), HU Ya-hong (2), JIANG Jia-qiang (1)

  1. School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
  2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2021-07-16 Revised: 2021-11-11 Accepted: 2022-08-25 Online: 2022-08-25 Published: 2022-08-25

Abstract: To improve quality of service (QoS), data centers must ensure that user jobs complete within their specified deadlines, which requires scheduling jobs efficiently according to the system resources available in real time. This paper proposes a job scheduling algorithm built on an LSTM (Long Short-Term Memory) job execution time prediction model, with the goal of minimizing job completion time. The model predicts the execution time of a user job from the job type, the job volume, the number of CPU cores and the amount of memory the job requires, and the ratio of the job's required resources to the total system resources. The predictions are used to judge whether the cluster can complete user jobs on time and provide a basis for ordering job execution sensibly. The hyperparameters that affect the performance of the LSTM prediction model, such as the number of iterations, the learning rate, and the number of network layers, are determined experimentally. Experiments show that, compared with the SVR, ARIMA, and BP models, the LSTM-based job execution time prediction model improves the coefficient of determination R² by 297%, 2.34%, and 5.66%, respectively, and that its average prediction error is only 0.78%.
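The two evaluation quantities named in the abstract, and the on-time check the predictions feed into, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the sample values, and the reading of "average error" as a mean relative error are all assumptions made for illustration, since the abstract does not define them.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / actual.

    Assumption: the abstract's "average error" is taken to be a
    relative error; the paper may define it differently.
    """
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


def meets_deadline(predicted_seconds, deadline_seconds):
    """Hypothetical check: can the job finish within its deadline?"""
    return predicted_seconds <= deadline_seconds


# Made-up execution times in seconds, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(actual, predicted)
mre = mean_relative_error(actual, predicted)
print(f"R^2 = {r2:.4f}, mean relative error = {mre:.2%}")
print("first job on time:", meets_deadline(predicted[0], 120.0))
```

An R² closer to 1 means the predictions explain more of the variance in the measured execution times, which is the sense in which the abstract compares the LSTM model against SVR, ARIMA, and BP.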

Key words: long short-term memory (LSTM), time prediction, job scheduling, quality of service (QoS)