• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2026, Vol. 48 ›› Issue (2): 228-237.

• High Performance Computing •

下一代智算中心RDMA QP通信机制

王军良,林宝洪,张娇,孙梦宇,潘永琛   

  (1.中国电信股份有限公司广东研究院,广东 广州 510660;
    2.北京邮电大学网络与交换技术国家重点实验室,北京 100876;
    3.中国电信股份有限公司北京研究院,北京 100045)

  • 收稿日期:2024-03-07 修回日期:2024-10-23 出版日期:2026-02-25 发布日期:2026-03-10
  • Funding:
    National Key Research and Development Program of China (2022YFB4501405)

An RDMA QP communication mechanism of next-generation intelligent computing center

WANG Junliang, LIN Baohong, ZHANG Jiao, SUN Mengyu, PAN Yongchen

  (1. China Telecom Guangdong Research Institute, Guangzhou 510660;
    2. State Key Laboratory of Networking and Switching Technology,
    Beijing University of Posts and Telecommunications, Beijing 100876;
    3. China Telecom Beijing Research Institute, Beijing 100045, China)
  • Received: 2024-03-07 Revised: 2024-10-23 Online: 2026-02-25 Published: 2026-03-10

摘要: 当前智算中心主要采用远程直接存取RDMA协议实现集群内部的超高性能通信,每对进程之间都需要建立基于可靠连接RC类型的队列对QP。在下一代大规模智算中心的AI大模型场景下,All-to-All和All Reduce这些分布式的集合通信操作会触发进程与进程间的全连接通信,基于RC的机制所需要维护的QP数量将突破百万,对RDMA网卡中有限的内存和性能带来极大挑战。为解决该问题,提出了高效可靠数据报ERD的RDMA QP通信机制,一方面通过可靠数据报RD来代替传统的RC,提高网卡的QP可扩展性;另一方面设计基于RD的可靠接收机制,在网络栈增加数据包丢包和快速有序处理,保证网络可靠性的同时提高传输性能。经过实验以及NS3仿真测试,ERD可以降低99.96%的QP数量,同时网络拥塞时传输性能可以提升15%以上。

关键词: 智算中心网络, AI大模型通信, 远程直接存取协议, QP通信

Abstract: Intelligent computing centers currently rely on the remote direct memory access (RDMA) protocol for ultra-high-performance communication within a cluster, and each pair of communicating processes must establish a queue pair (QP) of the reliable connection (RC) type. In the AI large-model workloads of next-generation large-scale intelligent computing centers, distributed collective communication operations such as All-to-All and All-Reduce trigger fully connected communication among processes. Under the RC-based mechanism, the number of QPs to be maintained exceeds one million, which severely strains the limited memory and performance of RDMA network interface cards (NICs). To address this problem, an RDMA QP communication mechanism named ERD (efficient reliable datagram) is proposed. First, ERD replaces the traditional RC with reliable datagram (RD) to improve QP scalability on the NIC. Second, it adds an RD-based reliable reception mechanism that handles packet loss and performs fast in-order processing in the network stack, preserving reliability while improving transmission performance. Experiments and NS3 simulations show that ERD reduces the number of QPs by 99.96% and improves transmission performance by more than 15% under network congestion.

Key words: intelligent computing center network, AI large-model communication, RDMA, QP communication
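
The scaling argument in the abstract can be illustrated with a back-of-the-envelope count. The sketch below is not the paper's model. It assumes a cluster of `nodes` hosts with `procs_per_node` processes each, that RC needs one QP per (local process, remote process) pair, and that an RD-style design needs only one QP per local process, ignoring any per-peer state a real RD implementation keeps. The function names and the example cluster size are hypothetical.

```python
def rc_qps_per_nic(nodes: int, procs_per_node: int) -> int:
    """QPs held by one NIC under full-mesh RC (assumed model).

    Every local process keeps one RC QP for every process on
    every other node; intra-node peers are not counted.
    """
    remote_procs = (nodes - 1) * procs_per_node
    return procs_per_node * remote_procs


def rd_qps_per_nic(procs_per_node: int) -> int:
    """QPs held by one NIC under an RD-style design (assumed model).

    Each local process uses a single datagram QP for all peers.
    """
    return procs_per_node


def qp_reduction(nodes: int, procs_per_node: int) -> float:
    """Fraction of QPs saved by the RD-style count relative to RC."""
    rc = rc_qps_per_nic(nodes, procs_per_node)
    rd = rd_qps_per_nic(procs_per_node)
    return 1.0 - rd / rc


if __name__ == "__main__":
    # Hypothetical cluster size, chosen only for illustration.
    nodes, procs = 1024, 8
    print("RC QPs per NIC:", rc_qps_per_nic(nodes, procs))
    print("RD QPs per NIC:", rd_qps_per_nic(procs))
    print("Reduction: {:.4%}".format(qp_reduction(nodes, procs)))
```

Under these assumptions the saving equals 1 - 1/((nodes - 1) * procs_per_node), so it grows with cluster size. The 99.96% figure reported in the abstract comes from the authors' own setup, which this sketch does not try to reproduce.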