[1] Qian Y J, Barton E, Wang T, et al. A novel network request scheduler for a large scale storage system[J]. Computer Science-Research and Development, 2009, 23(3-4): 143-148.

[2] Stefanovici I, Schroeder B, O'Shea G, et al. sRoute: Treating the storage stack like a network[C]//Proc of the 14th USENIX Conference on File and Storage Technologies, 2016: 197-212.

[3] Patel T, Byna S, Lockwood G K, et al. Revisiting I/O behavior in large-scale storage systems: The expected and the unexpected[C]//Proc of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019: 1-13.

[4] Patel T, Byna S, Lockwood G K, et al. Uncovering access, reuse, and sharing characteristics of I/O-intensive files on large-scale production HPC systems[C]//Proc of the 18th USENIX Conference on File and Storage Technologies, 2020: 91-102.

[5] Turner A, Sloan-Murphy D, Sivalingam K, et al. Analysis of parallel I/O use on the UK national supercomputing service, ARCHER using Cray LASSi and EPCC SAFE[J]. arXiv: 1906.03891, 2019.

[6] Brim M J, Lothian J K. Monitoring extreme-scale Lustre toolkit[J]. arXiv: 1504.06836, 2015.

[7] Behzad B, Byna S, Snir M. Pattern-driven parallel I/O tuning[C]//Proc of the 10th Parallel Data Storage Workshop, 2015: 43-48.

[8] Carns P, Harms K, Allcock W, et al. Understanding and improving computational science storage access through continuous characterization[J]. ACM Transactions on Storage, 2011, 7(3): 1-26.

[9] Xu C, Byna S, Venkatesan V, et al. LIOProf: Exposing Lustre file system behavior for I/O middleware[C]//Proc of the 2016 Cray User Group Meeting, 2016: 1-9.

[10] Patel T, Garg R, Tiwari D. GIFT: A coupon based throttle-and-reward mechanism for fair and efficient I/O bandwidth management on parallel storage systems[C]//Proc of the 18th USENIX Conference on File and Storage Technologies, 2020: 103-119.

[11] Agelastos A, Allan B, Brandt J, et al. The lightweight distributed metric service: A scalable infrastructure for continuous monitoring of large scale computing systems and applications[C]//Proc of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2014: 154-165.

[12] Park B H, Hukerikar S, Adamson R, et al. Big data meets HPC log analytics: Scalable approach to understanding systems at extreme scale[C]//Proc of the International Conference on Cluster Computing, 2017: 758-765.

[13] Huang D, Liu Q, Choi J, et al. Can I/O variability be reduced on QoS-less HPC storage systems?[J]. IEEE Transactions on Computers, 2018, 68(5): 631-645.

[14] Madireddy S, Balaprakash P, Carns P, et al. Analysis and correlation of application I/O performance and system-wide I/O activity[C]//Proc of the International Conference on Networking, Architecture, and Storage, 2017: 1-10.

[15] Gunasekaran R, Oral S, Hill J, et al. Comparative I/O workload characterization of two leadership class storage clusters[C]//Proc of the 10th Parallel Data Storage Workshop, 2015: 31-36.

[16] Morrone C J. Chaos/LMT-Lustre monitoring tool[EB/OL]. [2021-05-01]. https://github.com/chaos/lmt/wiki.

[17] Sivalingam K, Richardson H, Tate A, et al. LASSi: Metric based I/O analytics for HPC[C]//Proc of the High Performance Computer Symposium, 2019: 1-12.

[18] Booth S. Analysis and reporting of service data using the SAFE[Z]. Cray User Group, 2014.

[19] Eslami H, Kougkas A, Kotsifakou M, et al. Efficient disk-to-disk sorting: A case study in the decoupled execution paradigm[C]//Proc of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, 2015: 1-8.

[20] Cloud Native Computing Foundation Project. Prometheus[EB/OL]. [2021-05-01]. https://github.com/prometheus/prometheus.

[21] Qian Y J, Li X, Ihara S, et al. LPCC: Hierarchical persistent client caching for Lustre[C]//Proc of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019: 1-14.

[22] Cheng W, Li C Y, Zeng L F, et al. NVMM-oriented hierarchical persistent client caching for Lustre[J]. ACM Transactions on Storage, 2021, 17(1): 1-22.

[23] Ihara S. A new quality of service (QoS) policy for Lustre utilizing the Lustre network request scheduler (NRS) framework[C]//Proc of Lustre Administrator and Developers Workshop (LAD 2013), 2013: 1.

[24] Qian Y J, Li X, Ihara S, et al. A configurable rule based classful token bucket filter network request scheduler for the Lustre file system[C]//Proc of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017: 1-12.

[25] Li X. Lustre QoS solutions based on NRS TBF and client side performance balancing[EB/OL]. [2021-10-08]. http://lustrefs.cn/wp-content/uploads/2018/08/Lustre-QoS-solutions_Lixi.pdf.

[26] Li X, Zeng L F. LIME: A framework for Lustre global QoS management[C]//Proc of Lustre Administrator and Developer Workshop, 2018: 1.

[27] Cheng W, Deng S J, Zeng L F, et al. AIOC2: A deep Q-learning approach to autonomic I/O congestion control in Lustre[J]. Parallel Computing, 2021, 108: 102855.

[28] SLURM[EB/OL]. [2021-08-20]. https://slurm.schedmd.com/documentation.html.

[29] FIO. Fio benchmark tool[EB/OL]. [2021-08-20]. https://github.com/axboe/fio.