• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (3): 422-433.

• High Performance Computing •

Reinforcement learning control for data center refrigeration systems

WEI Dong1,2, JIA Yuchen1, HAN Shaoran3

  1. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  2. Key Laboratory of Intelligent Processing for Building Big Data, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  3. Beijing Jingcheng Ruida Electric Engineering Technology Co., Ltd., Beijing 100176, China
  • Received: 2023-07-13 Revised: 2024-03-20 Online: 2025-03-25 Published: 2025-04-01

Abstract: Data center refrigeration systems must operate continuously throughout the year, so their energy consumption is significant, and traditional PID control struggles to achieve energy savings for the system as a whole. To address this, a reinforcement learning control strategy is proposed for data center refrigeration systems, with the control objective of improving overall system energy efficiency while meeting cooling requirements. A two-layer hierarchical control structure is designed. The upper optimization layer introduces the multistep prediction-deep deterministic policy gradient (MP-DDPG) algorithm, which uses DDPG to handle the multi-dimensional continuous action space of the refrigeration system and to determine the water valve opening of the air handling unit and the optimal setpoint for each loop in the chilling station system. Multistep prediction is employed to improve algorithm efficiency and to overcome the impact of large system delays during real-time control. The lower field control layer uses PID control so that the controlled variables track the optimal setpoints obtained from the optimization layer, achieving performance optimization without disrupting the existing field control system. To address the difficulty of applying model-free reinforcement learning to real-time control, a system prediction model is first constructed and the reinforcement learning controller is trained offline through interaction with this model; online real-time control is then implemented. Experimental results show that, compared with the traditional DDPG algorithm, the learning efficiency of the controller is improved by 50%. Compared with PID and MP-DQN (multistep prediction-deep Q network), the system's dynamic performance is improved and overall energy efficiency is increased by approximately 30.149% and 11.6%, respectively.

Key words: data center refrigeration system, predictive control, reinforcement learning, deep deterministic policy gradient, integrated learning
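
The two-layer structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `upper_layer_setpoint` is a hypothetical stand-in for the trained MP-DDPG actor (here a simple clipped linear rule), and the first-order plant, the PID gains, and all numeric values are assumptions chosen only to show how a lower PID loop tracks a setpoint handed down by an optimization layer.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete PID controller standing in for the lower field control layer."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def upper_layer_setpoint(load_kw: float) -> float:
    """Hypothetical placeholder for the optimization layer (NOT MP-DDPG).

    Maps IT load to a chilled-water temperature setpoint in degrees C,
    clipped to an assumed range of [7, 12].
    """
    return max(7.0, min(12.0, 12.0 - 0.01 * load_kw))


def simulate(load_kw: float, steps: int = 500, dt: float = 1.0) -> float:
    """Run the two-layer loop on an assumed first-order plant; return final temperature."""
    pid = PID(kp=0.5, ki=0.05, kd=0.0, dt=dt)
    uncontrolled_temp = 15.0   # assumed temperature with zero control effort
    tau = 20.0                 # assumed plant time constant in seconds
    temp = uncontrolled_temp
    setpoint = upper_layer_setpoint(load_kw)  # upper layer decides the target
    for _ in range(steps):
        u = pid.step(setpoint, temp)          # lower layer tracks the target
        # Explicit Euler step of: tau * dT/dt = -(T - T_uncontrolled) + u
        temp += dt / tau * (-(temp - uncontrolled_temp) + u)
    return temp


if __name__ == "__main__":
    print(f"setpoint: {upper_layer_setpoint(300.0):.2f} C")
    print(f"final temperature: {simulate(300.0):.2f} C")
```

In a setup like the one the abstract describes, the placeholder rule would be replaced by a policy trained offline against a system prediction model, while the PID loop itself stays unchanged, which is the point of keeping the existing field control layer intact.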