• Journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (10, High-Performance Special Issue): 1765-1773.


Improving the performance of BeeGFS parallel file system

SONG Zhen-long, LI Xiao-fang, LI Qiong, XIE Xu-chao, WEI Deng-ping, DONG Yong, WANG Rui-bo

  1. (School of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2020-06-12 Revised: 2020-07-14 Accepted: 2020-10-25 Online: 2020-10-25 Published: 2020-10-23

Abstract: With the arrival of the era of big data and Artificial Intelligence (AI), supercomputing centers and data centers face an ever-increasing demand for high-performance storage systems, scaling from petabytes to exabytes. In recent years, High-Performance Computing (HPC) systems have been widely used for big data and AI applications. The I/O patterns of emerging AI applications are characterized by small, batch-processed file accesses, which makes HPC storage system design increasingly complicated. Parallel File Systems (PFSs), designed primarily for bandwidth-oriented applications, are one of the most effective ways to manage data in HPC systems. However, existing PFSs cannot deliver high performance for AI applications. This paper investigates and improves the performance of BeeGFS, an emerging PFS for HPC systems. We propose a Key-Value (KV)-based metadata management module to improve the IOPS of metadata accesses, introduce asynchronous I/O and multi-threading into the parallel I/O processing module to improve I/O processing concurrency, and employ a multi-track communication mechanism to increase network bandwidth. Experimental results show that the modified BeeGFS significantly improves the performance of both metadata and data accesses, and achieves scores up to 2 times those of the original BeeGFS under the IO500 benchmark.



Key words: high performance computing (HPC), parallel file system (PFS), BeeGFS, IO500