• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (07): 1344-1349.

• Articles •

TCP data receive offload based on a multicore NPU

李杰,陈曙晖   

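  1. (College of Computer, National University of Defense Technology, Changsha, Hunan 410073)
  • Received: 2015-06-25 Revised: 2015-08-11 Published: 2016-07-25 Released: 2016-07-25
  • Funding:

    National Natural Science Foundation of China (61379148)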

Multicore NPU based TCP large receive offload    

LI Jie,CHEN Shuhui   

  1. (College of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2015-06-25 Revised: 2015-08-11 Online: 2016-07-25 Published: 2016-07-25

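Abstract:

Ethernet is currently advancing far faster than memory and CPUs, so memory access and CPU protocol processing have become the performance bottleneck for TCP. Ever-increasing network bandwidth places a heavy burden on the CPU: roughly 1 GHz of CPU processing capacity is needed for protocol processing of 1 Gbps of network traffic. To address this, a multicore NPU is used as the NIC to implement checksum computation and out-of-order packet reassembly on the TCP receive data path. The merged large packets are then passed through the Linux NIC driver to the protocol stack, which reduces both the number of packets the stack processes and the number of interrupts the NIC generates, improving TCP performance on the end system. In a 10 Gbps Ethernet network, the experiments achieve a TCP receive throughput of 4.9 Gbps.

Key words: TCP out-of-order reassembly, TCP data receive offload, LRO, TOE, multicore NPU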

Abstract:

Ethernet technology is developing much faster than memory and CPU technologies, so memory access and CPU processing of the network stack have become the bottleneck of TCP performance on end systems. Constantly increasing network bandwidth burdens the CPU severely: approximately 1 GHz of CPU resource is needed to process 1 Gbps of network traffic. We therefore use a multicore NPU as the NIC and offload TCP checksum verification and packet reordering to it. The multicore NPU aggregates small TCP packets into fewer but larger packets, which reduces both the number of packets processed by the network stack and the number of interrupts generated by the NIC, and thereby improves TCP performance on end systems. Experimental results show that a TCP receive throughput of 4.9 Gbps can be achieved in a 10 Gbps network.

Key words: TCP packet reordering; TCP data receive offload; LRO; TOE; multicore NPU
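
The two offloaded functions named in the abstract, checksum verification and reordering with aggregation, can be illustrated with a minimal Python sketch. This is not the authors' NPU implementation: the names `Segment`, `internet_checksum`, and `FlowAggregator` are hypothetical, the sketch handles a single flow, ignores sequence-number wraparound and overlapping retransmissions, and computes the checksum over raw bytes only rather than over the full TCP pseudo-header.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    seq: int          # sequence number of the first payload byte
    payload: bytes


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


class FlowAggregator:
    """Reorders segments of one flow and merges them into larger packets.

    Assumption: one instance per TCP flow; max_bytes caps the merged size.
    """

    def __init__(self, next_seq: int, max_bytes: int = 65535):
        self.next_seq = next_seq             # next in-order sequence number
        self.max_bytes = max_bytes
        self.pending: Dict[int, bytes] = {}  # out-of-order segments by seq
        self.merged = bytearray()            # payload of the packet being built
        self.merged_seq = next_seq           # seq of the first merged byte

    def receive(self, seg: Segment) -> List[Segment]:
        """Accept one segment; return merged packets that became full."""
        out: List[Segment] = []
        if seg.seq < self.next_seq or not seg.payload:
            return out                       # stale or empty: dropped here
        self.pending[seg.seq] = seg.payload
        # Drain every buffered segment that is now contiguous.
        while self.next_seq in self.pending:
            data = self.pending.pop(self.next_seq)
            if self.merged and len(self.merged) + len(data) > self.max_bytes:
                out.extend(self.flush())
            self.merged += data
            self.next_seq += len(data)
        return out

    def flush(self) -> List[Segment]:
        """Hand the packet built so far to the host stack, if any."""
        if not self.merged:
            return []
        big = Segment(self.merged_seq, bytes(self.merged))
        self.merged = bytearray()
        self.merged_seq = self.next_seq
        return [big]
```

In this sketch, two 4-byte segments arriving in reverse order are delivered to the host as one 8-byte packet after a single `flush()`, so the stack sees one packet (and, on real hardware, fewer interrupts) instead of two. A real offload engine would also flush on a timer and on TCP flags such as PSH or FIN, which are omitted here.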