• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (3): 392-399.

• High Performance Computing

Parallel file system network driver based on Tianhe inter-connection system

DONG Yong, WU Huijun, YANG Lihua, ZHANG Wei, WANG Ruibo, ZHOU Enqiang

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2023-09-12 Revised: 2023-10-19 Online: 2025-03-25 Published: 2025-04-01

Abstract: The parallel file system is an essential component of the software stack in high-performance computing systems, and its driver for the high-speed network is crucial to providing efficient data access. This paper presents the design and implementation of GLND, a parallel file system network driver based on the Tianhe high-speed interconnect (TH-Express). GLND is optimized in three areas: parallelization, communication protocol, and fault tolerance. It achieves high throughput through VP-level parallelism combined with balanced pipeline partitioning. It adaptively selects the underlying communication protocol based on factors such as message size, and implements a NUMA-aware memory management mechanism. In addition, an adaptively adjustable timeout mechanism prevents abnormal timeouts at the software layer from affecting the completion of communication operations. Experimental results show that, under the same hardware conditions, GLND improves write bandwidth by an average of 23.69% and read bandwidth by an average of 79.25% compared with TCP.

Key words: parallel file system, interconnect, network programming interface