• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (8): 184-190.

• Articles •

Design of the RDMA Reliable Communication Protocol Based on Dynamic Connection

LIU Lu, ZHANG Lei, CAO Jijun, DAI Yi

  1. (School of Computer Science, National University of Defense Technology, Changsha 410073, China)
  • Received: 2012-04-28 Revised: 2012-06-11 Online: 2012-08-25 Published: 2012-08-25
  • Supported by:

    National High Technology Research and Development Program of China (863 Program) (2012AA01A301); National Natural Science Foundation of China (61003301)

Abstract:

Future 100-Petascale and Exascale high-performance computer systems will place higher demands on the transmission reliability, performance balance, and scalability of their interconnection networks. The RDMA transport model proposed in this paper achieves end-to-end reliable data transmission by configuring only a small amount of resources and using them through dynamic connections. Compared with conventional reliable communication protocols such as Infiniband, the proposed scheme has the following advantages: (1) it supports automatic rerouting, so messages can bypass faulty regions of the network and still be delivered reliably; (2) it supports out-of-order packet arrival and multi-path transmission between source and destination, and it provides message flow control, which together help balance overall network performance, reduce network hot spots, and relieve congestion; (3) the reliability data structures are implemented in the communication interface hardware, so no main memory is consumed to establish connections, which gives the system very high scalability. Preliminary test results show that, with the optimizations applied, the protocol does not increase the transmission latency of messages smaller than 4K bytes.

Key words: reliable communication protocol; RDMA; network interface; Infiniband; dynamic connection
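
The abstract names two ideas that a small example may make more concrete: accepting packets that arrive out of order, and serving many peers from a small, fixed set of reliability resources that are bound to a message only while it is in flight. The abstract does not describe the paper's actual data structures, so the Python sketch below is purely illustrative. The names `ReceiveContext`, `ContextPool`, and `on_packet`, the `(source, message_id)` key, and the "busy" reply are all assumptions made for this example, not the authors' design.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiveContext:
    """Hypothetical per-message reassembly state (an assumption, not from the paper)."""
    total_packets: int
    received: set = field(default_factory=set)  # sequence numbers seen so far

    def complete(self) -> bool:
        return len(self.received) == self.total_packets


class ContextPool:
    """A fixed-size pool of contexts that are bound to messages on demand."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.active = {}  # (source, message_id) -> ReceiveContext

    def on_packet(self, source, message_id, seq, total_packets):
        """Record one packet and return 'busy', 'duplicate', 'pending', or 'complete'."""
        key = (source, message_id)
        ctx = self.active.get(key)
        if ctx is None:
            if len(self.active) >= self.capacity:
                return "busy"  # no free context; the sender would have to retry
            ctx = self.active[key] = ReceiveContext(total_packets)
        if seq in ctx.received:
            return "duplicate"  # e.g. a retransmission that raced the original
        ctx.received.add(seq)  # arrival order does not matter
        if ctx.complete():
            del self.active[key]  # release the context so another message can use it
            return "complete"
        return "pending"
```

With `pool = ContextPool(capacity=1)`, delivering packets 2, 0, 2, 1 of a three-packet message through `pool.on_packet("node0", 7, seq, 3)` returns `pending`, `pending`, `duplicate`, `complete`, after which the single context is free again. Because the pool size is independent of the number of peers, state grows with the number of messages in flight rather than with system size, which is the scalability argument the abstract makes.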