Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (08): 1366-1375.
YUAN Yuan,LI Shi-jie,XING Jian-ying,JIANG Ju-ping
Abstract: High-performance computer (HPC) systems built for future exascale computing will require a several-fold increase in assembly density, along with a large expansion in node scale. This poses major challenges for the HPC monitoring subsystem in terms of scalability, reliability, serviceability, and maintenance. In response to these challenges, this paper presents design ideas for the monitoring subsystem from four aspects: architecture, network, functionality, and maintenance. The feasibility and advantages of some of these designs are verified on a prototype system, and the results can benefit the construction of future exascale HPC systems.
Key words: exascale high-performance computer system, monitoring subsystem, scalability, reliability
YUAN Yuan, LI Shi-jie, XING Jian-ying, JIANG Ju-ping. Monitoring subsystem for exascale HPC systems: Challenges and design[J]. Computer Engineering & Science, 2021, 43(08): 1366-1375.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2021/V43/I08/1366