• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (09): 1532-1541.

• High Performance Computing •

SlurmX: A task scheduling system refactored from Slurm using object-oriented methodology

WEN Rui-lin(1), FAN Chun(2,3,4), MA Yin-ping(2), WANG Zheng-dan(5), XIANG Guang-yu(5), FU Zhen-xin(2)

  (1. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871;
   2. Computer Center, Peking University, Beijing 100871;
   3. National Biomedical Imaging Center, Peking University, Beijing 100871;
   4. Peng Cheng Laboratory, Shenzhen 518055;
   5. School of Software and Microelectronics, Peking University, Beijing 102600, China)
  • Received: 2022-03-14 Revised: 2022-05-09 Accepted: 2022-09-25 Online: 2022-09-25 Published: 2022-09-25

Abstract: The widely used Slurm task scheduling system suffers from bloated code, slow development of new features, and difficult maintenance. Drawing on the strengths and weaknesses of mature task scheduling systems such as Slurm and HTCondor, this paper designs SlurmX, a high-performance task and resource scheduling system with excellent performance, good scalability, and easy maintenance. Object-oriented methodology is used to refactor and reorganize the internal components of Slurm from top to bottom at the functional level. From the perspectives of system architecture design and internal component design, the paper discusses how the system provides high scalability and low coupling between internal modules while preserving performance.

Key words: task scheduling system, object-oriented methodology, Slurm, cgroups