• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (10, High-Performance Computing special issue): 1765-1773.

• High-Performance Computer System Software •

Research on performance optimization techniques for the BeeGFS parallel file system

宋振龙,李小芳,李琼,谢徐超,魏登萍,董勇,王睿伯   

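  1. (School of Computer, National University of Defense Technology, Changsha 410073, Hunan, China)
  • Received: 2020-06-12 Revised: 2020-07-14 Accepted: 2020-10-25 Online: 2020-10-25 Published: 2020-10-23
  • Supported by:
    National Key Research and Development Program of China (2018YFB0204301)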

Improving the performance of BeeGFS parallel file system

SONG Zhen-long, LI Xiao-fang, LI Qiong, XIE Xu-chao, WEI Deng-ping, DONG Yong, WANG Rui-bo

  1. (School of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2020-06-12 Revised: 2020-07-14 Accepted: 2020-10-25 Online: 2020-10-25 Published: 2020-10-23

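Abstract (translated from the Chinese): In the era of big data and artificial intelligence, the storage demands of supercomputing centers and data centers are growing from the petabyte scale toward the exabyte scale, and many big data and AI applications now run on high-performance computing (HPC) systems. Emerging deep learning applications read large batches of small files in random order, which makes the I/O patterns of HPC systems more complex and makes storage management and I/O bottlenecks increasingly prominent. Parallel file systems are an effective way to manage data storage on supercomputers, but traditional parallel file systems mainly target scientific computing workloads that need high bandwidth, and they struggle to meet the storage needs of AI applications. To address these problems, this paper takes the emerging BeeGFS file system as its basis and studies key techniques for optimizing parallel file system performance. It designs and implements a metadata management module based on key-value storage to improve metadata IOPS, a parallel I/O processing model based on asynchronous I/O and multithreading to raise I/O processing concurrency, and a multi-rail communication mechanism to increase network communication bandwidth. An IO500 performance evaluation environment was built. Under identical configurations, results from the two classes of benchmarks (I/O bandwidth and metadata) show that the improved parallel file system delivers substantially higher metadata and data read/write performance, and that its IO500 score is more than twice that of the original system.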


Keywords: high-performance computing, parallel file system, BeeGFS, IO500

Abstract: In the era of big data and Artificial Intelligence (AI), the demand of supercomputing centers and data centers for high-performance storage is growing from the petabyte scale to the exabyte scale. In recent years, High-Performance Computing (HPC) systems have been widely used for big data and AI applications. Emerging AI applications, deep learning in particular, issue batched random reads of small files, which makes the I/O patterns of HPC systems more complex and aggravates storage management and I/O bottlenecks. A Parallel File System (PFS) is one of the most effective ways to manage data on HPC systems, but existing PFSs are designed primarily for bandwidth-oriented scientific applications and cannot deliver high performance for AI applications. This paper investigates and improves the performance of BeeGFS, an emerging PFS for HPC systems. We propose a Key-Value (KV)-based metadata management module to improve the IOPS of metadata accesses, introduce asynchronous I/O and multi-threading into the parallel I/O processing module to improve I/O concurrency, and employ a multi-rail communication mechanism to increase network bandwidth. Experimental results under the same configuration show that the modified BeeGFS significantly improves the performance of both metadata and data accesses and achieves an IO500 score more than twice that of the original BeeGFS.



Key words: high performance computing (HPC), parallel file system (PFS), BeeGFS, IO500