• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (11): 182-186.

• Papers •

一种并行作业任务启动模型及其可扩展性分析

宋长明,龚道永,张宏宇   

  1. (江南计算技术研究所,江苏 无锡 214000)
  • 收稿日期:2013-08-24 修回日期:2013-10-05 出版日期:2013-11-25 发布日期:2013-11-25

A parallel job launch model and its scalability analysis            

SONG Chang-ming, GONG Dao-yong, ZHANG Hong-yu

  1. (Jiangnan Institute of Computing Technology, Wuxi 214000, China)
  • Received: 2013-08-24 Revised: 2013-10-05 Online: 2013-11-25 Published: 2013-11-25

摘要:

随着高性能计算机系统规模的不断扩大,作业启动的时间越来越长,大作业的启动时间逐渐成为影响系统规模扩展的一个重要因素。同时,元器件数目快速增长带来的更频繁的故障也使大规模并行应用在完成前可能经历多次反复提交,因此作业任务的启动效率也直接影响着系统计算资源的有效利用率和用户使用体验。通过设计一种层次式并行作业任务启动模型,并对其在不同作业规模下的性能进行测试、分析与优化,经过优化后该模型能够支持一个大规模系统的作业任务启动与控制,并具备较好的可扩展性。

关键词: 作业任务启动, 层次式管理, 虚拟化, 网络数据优化, 扩展性

Abstract:

As high performance computer systems grow in scale, job launch takes longer and longer, and the launch time of large jobs is gradually becoming an important factor limiting further system scaling. Meanwhile, the rapid growth in the number of components causes more frequent failures, so a large-scale parallel application may have to be resubmitted several times before it completes. Task launch efficiency therefore directly affects both the effective utilization of computing resources and the user experience. A hierarchical parallel job launch model is designed, and its performance at different job scales is tested, analyzed and optimized. After optimization, the model can support task launch and control on a large-scale system and shows good scalability.

Key words: job task launch; hierarchical management; virtualization; network data optimization; scalability