• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (10, High Performance Special Issue): 1711-1719.

A regional shared and high concurrent storage architecture based on NVMeoF storage pool

LI Qiong, SONG Zhen-long, YUAN Yuan, XIE Xu-chao

  1. (School of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received:2020-06-11 Revised:2020-07-12 Accepted:2020-10-25 Online:2020-10-25 Published:2020-10-23

Abstract:

In the era of exascale computing and big data, High Performance Computing (HPC) systems have been widely deployed as infrastructure for big data analytics in order to exploit their parallel computing capabilities. As I/O patterns in HPC systems grow increasingly complex and heterogeneous, breaking through the I/O bottleneck has become both challenging and urgent. In recent years, flash-based storage arrays and storage servers have gradually been deployed in HPC storage systems. However, conventional shared storage architectures, I/O software stacks, and storage networks were designed primarily for Hard Disk Drives (HDDs); they add severe overhead to the I/O path and prevent HPC storage systems from taking full advantage of the performance of Non-Volatile Memory (NVM). To achieve low I/O latency, high concurrent I/O throughput, and high burst I/O bandwidth, this paper proposes a regional shared and high concurrent storage architecture. We design an NVMeoF-based burst I/O storage pool (NV-BSP), which implements key techniques such as virtualized storage pool resource management and NVMeoF network storage communication over the Tianhe high-speed interconnect. NV-BSP can scale both horizontally and vertically, and it effectively supports burst I/O acceleration and low-latency remote storage for specific computing tasks. In addition, we propose a Quality-of-Service (QoS) control strategy for storage systems that serve mixed HPC and big data applications. Experimental results on a prototype system show that the write performance of NV-BSP scales as the number of I/O handling threads increases. Compared with the built-in MD-RAID in Linux, NV-BSP obtains higher I/O bandwidth. Compared with the node-local storage pool, NVMeoF-based remote storage increases I/O latency by only 59.25 μs for reads and 54.03 μs for writes. By disaggregating storage from computation, NV-BSP significantly improves system scalability and reliability while delivering performance comparable to local storage.


Key words: storage architecture, burst buffer, NVMe SSD, NVMe over fabrics, high performance computing, big data