• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (09): 1521-1531.

• High Performance Computing •

Beacon+: A scalable lightweight end-to-end I/O performance monitoring, analysis and diagnosis system for exascale supercomputers

YANG Bin1,2, WANG Jing-yu3, LIU Shi-chao1,2, SHAO Ming-shan1,2, XIAO Wei3, CHEN Qi3,4, HE Xiao-bin3, LIU Wei-guo1,2, XUE Wei2,4

    (1. School of Software, Shandong University, Jinan 250101;
    2. National Supercomputing Center in Wuxi, Wuxi 214072;
    3. National Research Center of Parallel Computer Engineering & Technology, Beijing 100080;
    4. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2022-01-15 Revised: 2022-05-18 Accepted: 2022-09-25 Online: 2022-09-25 Published: 2022-09-25

Abstract: With the barrier to exascale computing broken, high performance computing has entered a new era. To meet the growing demand for data access, supercomputers have adopted new technologies and storage media, which makes their architectures increasingly complex and makes performance anomalies and system hotspots difficult to locate. To this end, Beacon+, a scalable lightweight end-to-end I/O performance monitoring, analysis and diagnosis system for exascale supercomputers, is designed and implemented. It monitors and analyzes the data access process of each application in real time without modifying the application code or scripts. Through online and offline compression methods and distributed caching and storage mechanisms, Beacon+ keeps itself highly scalable and low-cost, and can continuously and stably provide I/O diagnostic services. Using the new-generation Sunway supercomputer as the deployment platform, we demonstrate Beacon+'s low overhead, high accuracy and high efficiency in I/O diagnosis through standard I/O benchmark applications and real-world applications.


Key words: