A regional shared and high concurrent storage architecture based on NVMeoF storage pool

Abstract

In the era of exascale computing and big data, High Performance Computing (HPC) systems have been widely deployed as the infrastructure for big data analytics in order to leverage their parallel computing capabilities. As the I/O patterns in HPC systems become increasingly complicated and heterogeneous, breaking through the I/O bottleneck is both challenging and urgent. In recent years, flash-based storage arrays and storage servers have been gradually deployed in HPC storage systems. However, conventional shared storage architectures, I/O software stacks, and storage networking designs were primarily developed for Hard Disk Drives (HDDs), which induces severe overhead in the I/O path and prevents HPC storage systems from taking full advantage of the performance benefits of Non-Volatile Memory (NVM). To achieve low I/O latency, high concurrent I/O throughput, and high burst I/O bandwidth, this paper proposes a regional shared and high concurrent storage architecture. We design an NVMeoF-based burst I/O storage pool (NV-BSP), which implements key techniques such as virtualized storage pool resource management and NVMeoF network storage communication over the Tianhe high-speed interconnect. NV-BSP can scale both horizontally and vertically, and can effectively support burst I/O acceleration and low-latency remote storage access for specific computing tasks. In addition, we propose a Quality-of-Service (QoS) control strategy for storage systems running mixed HPC and big data applications. Experimental results on a prototype system show that the write performance of NV-BSP scales as the number of I/O handling threads increases. Compared with the built-in MD-RAID in Linux, NV-BSP obtains higher I/O bandwidth. Compared with the node-local storage pool, the I/O latencies of NVMeoF-based remote storage increase by only 59.25 μs for read and 54.03 μs for write. By disaggregating storage from computation, NV-BSP significantly improves system scalability and reliability while delivering performance comparable to local storage.

Key words: storage architecture, burst buffer, NVMe SSD, NVMe over fabrics, high performance computing, big data

LI Qiong, SONG Zhen-long, YUAN Yuan, XIE Xu-chao. A regional shared and high concurrent storage architecture based on NVMeoF storage pool[J]. Computer Engineering & Science, 2020, 42(10, High Performance Computing Special Issue): 1711-1719.

