• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2022, Vol. 44, Issue (04): 571-583.

• High Performance Computing •

Survey of performance optimization research on read/write consensus algorithms for distributed storage systems

  • Funding:
    National Natural Science Foundation of China (61971145); Major Project of the China Association of Higher Education (2020XXHZ01)

SHEN Jia-jie (1), LU Xiu-wen (1,2,3), XIANG Wang (1), ZHAO Ze-yu (1), WANG Xin (1,2,3)

  1. Informatization Office, Fudan University, Shanghai 200433, China
  2. School of Computer Science, Fudan University, Shanghai 200433, China
  3. Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China
  • Received: 2021-10-08; Revised: 2021-12-03; Accepted: 2022-04-25; Online: 2022-04-25; Published: 2022-04-20

Abstract: Consensus algorithms are widely deployed in distributed storage systems to ensure the correctness of read and write (I/O) operations. However, a consensus algorithm typically relies on a complex communication protocol to keep I/O operations correct across multiple storage nodes, which incurs high network transmission overhead and I/O latency. Because the implementation mechanisms of consensus algorithms differ considerably, a given algorithm often performs I/O efficiently, and preserves the quality of service of the applications built on it, only in the storage scenario it suits. In practice, developers therefore need to choose a consensus algorithm according to the storage application scenario in order to reduce the system overhead of I/O operations. To clarify which scenarios suit which algorithms, this paper introduces the consensus problems that arise in distributed storage systems and surveys the implementation mechanisms of current consensus algorithms. It summarizes the mainstream consensus algorithms for replica-based and erasure-coded storage systems and compares them in terms of implementation mechanism, network overhead, and data storage overhead. On this basis, and with reference to two classic application scenarios (single-data-center distributed storage systems and cross-data-center inter-cloud storage systems), it summarizes the points developers should attend to when deploying consensus algorithms in real storage systems, analyzes the problems that urgently need solving and possible ways to improve I/O performance, and discusses future directions for consensus algorithms.

Key words: consensus algorithm, distributed storage system, erasure-coded storage system, I/O operation, performance optimization