• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (08): 1381-1389.

• High Performance Computing •

A distributed metadata load balancing algorithm based on dynamic space partitioning and compressed Bloom filter

XUE Mei-ting1, YU Wan-gang2, ZHANG Ji-lin1, ZENG Yan2, YUAN Jun-feng2, ZHOU Li2

  (1. School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018;
   2. School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China)
  • Received: 2023-11-03 Revised: 2023-12-29 Accepted: 2024-08-25 Online: 2024-08-25 Published: 2024-09-02

Abstract: A distributed metadata management system uses multiple metadata servers (MDSs) to store and manage large volumes of metadata. By applying a mapping strategy that spreads the metadata across the MDSs, the system reduces the load on each individual MDS, which lowers disk access frequency and improves the overall performance of metadata management. Typically, a hash function maps metadata keys to MDSs. However, when the feature values of the data are similar, the one-way nature of the hash function can lead to an imbalanced data distribution, which degrades MDS performance. To address this problem, this paper proposes a metadata load balancing algorithm based on dynamic space partitioning and a compressed Bloom filter. The algorithm first constructs hash buckets to organize the metadata keys and uses a hash algorithm to map each key to a bucket. During mapping, the target hash bucket is adjusted dynamically according to the load of the MDSs, and the mapping information of each metadata key is stored in order within its bucket. When metadata is accessed, the algorithm first screens the metadata key with a compressed Bloom filter and then performs a binary search within the designated hash bucket to retrieve the mapping information. Compared with recent metadata management algorithms, the proposed algorithm keeps the MDS load balanced even when the keys are skewed. Experimental results show that the algorithm improves search performance by 20% over the best-performing of the compared metadata management algorithms, with only a 2% increase in memory consumption.
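The pipeline described in the abstract (hash keys into buckets, adjust placement by MDS load, keep each bucket sorted, screen lookups with a Bloom filter, then binary-search the bucket) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The names `BloomFilter`, `MetadataRouter`, `slack`, and the load rule are hypothetical, the Bloom filter is a plain one with the compression step omitted, and the bucket index is kept fixed by the key hash while only the recorded MDS is adjusted, which is a simplifying assumption.

```python
import bisect
import hashlib


def _hash(key: str, seed: int) -> int:
    """Deterministic 64-bit hash of key; seed selects an independent function."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


class BloomFilter:
    """Plain Bloom filter (assumption: the paper's compression is not modeled)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key: str):
        return [_hash(key, i) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class MetadataRouter:
    """Hypothetical key -> MDS mapping with load-aware placement."""

    def __init__(self, num_mds: int = 4, num_buckets: int = 16, slack: float = 1.2):
        self.num_mds = num_mds
        self.slack = slack                      # tolerated overload factor (assumed)
        self.load = [0] * num_mds               # keys currently assigned to each MDS
        self.buckets = [[] for _ in range(num_buckets)]  # sorted (key, mds_id) pairs
        self.bloom = BloomFilter()

    def _bucket(self, key: str):
        return self.buckets[_hash(key, 99) % len(self.buckets)]

    def insert(self, key: str) -> int:
        existing = self.lookup(key)
        if existing is not None:
            return existing
        mds = _hash(key, 7) % self.num_mds      # default hash placement
        mean = sum(self.load) / self.num_mds
        if self.load[mds] + 1 > self.slack * (mean + 1):
            # Preferred MDS is overloaded: redirect to the least-loaded one.
            mds = min(range(self.num_mds), key=self.load.__getitem__)
        self.load[mds] += 1
        bisect.insort(self._bucket(key), (key, mds))  # keep the bucket sorted
        self.bloom.add(key)
        return mds

    def lookup(self, key: str):
        if not self.bloom.might_contain(key):   # cheap negative answer
            return None
        bucket = self._bucket(key)
        i = bisect.bisect_left(bucket, (key, -1))     # binary search by key
        if i < len(bucket) and bucket[i][0] == key:
            return bucket[i][1]
        return None


if __name__ == "__main__":
    router = MetadataRouter()
    for n in range(1000):
        router.insert(f"/data/project/file_{n}")
    print("per-MDS load:", router.load)
    print("owner of file_42:", router.lookup("/data/project/file_42"))
```

In this sketch the Bloom filter only rejects keys that were never inserted. A false positive costs one extra binary search and never returns a wrong MDS, because the bucket is still checked for an exact key match.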

Key words: distributed metadata management, load balancing algorithm, consistent hashing, compressed Bloom filter