• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (10, High Performance Computing special issue): 1749-1756.

• High Performance Computer Architecture •

A data queue scheduling method supporting multi-priority and multi-output channels and its hardware implementation

XU Jin-bo (徐金波), CHANG Jun-sheng (常俊胜), LI Yan (李琰)

  1. (School of Computer, National University of Defense Technology, Changsha 410073, China)

  • Received: 2020-06-10 Revised: 2020-07-23 Accepted: 2020-10-25 Online: 2020-10-25 Published: 2020-10-23
  • Supported by:
    National Key Research and Development Program of China (2018YFB0204300)

Abstract: To meet the need for multiple groups of input data to request multiple output channel resources within an ASIC (application-specific integrated circuit) chip, this paper proposes a data queue scheduling method that supports multi-priority and multi-output channels. Firstly, the method is widely applicable: it can schedule in a random mode to achieve load balancing, or differentiate the quality of service (QoS) by configuring different priorities. In the random mode, the idle output channels receive input data in round-robin fashion. In the QoS mode, all input sources and output channels are divided into different priorities, so that a given group of output channels only receives data from input sources of the corresponding priority. Secondly, the method has a low hardware implementation cost, because multiple output channels share a single arbiter instead of each using its own. The method is applied in the network interface chip of the Tianhe supercomputer system to optimize the scheduling of the data queues at the software/hardware interface, and the design is tested in the verification environment. Current test results show that, compared with the traditional single-output queue scheduler, the proposed method increases the scheduling time cost by only 0.3% to 2% and the hardware resource cost by only about 1.5%, but processes direct memory read transactions about twice as fast. In addition, when configured for QoS mode, the execution time of high-priority threads is only about 1/3 that of low-priority threads, and this is flexibly configurable.



Key words: queue scheduling, priority, quality of service (QoS), arbiter
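
The two scheduling modes described in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the class `QueueScheduler`, its method names, the round-robin arbitration among input sources, and the integer priority labels are all assumptions made for illustration. The abstract only states that idle output channels take turns receiving data in the random mode, that output channels accept only input sources of the matching priority in the QoS mode, and that all output channels share one arbiter.

```python
from collections import deque


class QueueScheduler:
    """Toy model (hypothetical): one shared arbiter feeding several output channels."""

    def __init__(self, num_sources, num_channels, source_prio=None, channel_prio=None):
        self.queues = [deque() for _ in range(num_sources)]
        self.busy = [False] * num_channels
        # QoS mode when both priority lists are given; otherwise random mode.
        self.qos = source_prio is not None and channel_prio is not None
        self.source_prio = source_prio
        self.channel_prio = channel_prio
        self.next_source = 0   # round-robin pointer of the single shared arbiter
        self.next_channel = 0  # round-robin pointer over the output channels

    def push(self, source, item):
        """Enqueue one data item from the given input source."""
        self.queues[source].append(item)

    def release(self, channel):
        """Mark an output channel as idle again."""
        self.busy[channel] = False

    def _pick_channel(self, source):
        """Return the next idle channel allowed for this source, or None."""
        n = len(self.busy)
        for offset in range(n):
            ch = (self.next_channel + offset) % n
            if self.busy[ch]:
                continue
            # QoS mode: a channel only accepts sources of its own priority.
            if self.qos and self.channel_prio[ch] != self.source_prio[source]:
                continue
            return ch
        return None

    def step(self):
        """One arbitration round: return (source, item, channel) or None."""
        m = len(self.queues)
        for offset in range(m):
            src = (self.next_source + offset) % m
            if not self.queues[src]:
                continue
            ch = self._pick_channel(src)
            if ch is None:
                continue  # no idle channel is eligible for this source
            item = self.queues[src].popleft()
            self.busy[ch] = True
            self.next_source = (src + 1) % m
            self.next_channel = (ch + 1) % len(self.busy)
            return src, item, ch
        return None
```

In this sketch, constructing `QueueScheduler(2, 2)` gives the random mode, while passing priority lists such as `source_prio=[0, 1]` and `channel_prio=[0, 0, 1]` reserves two channels for priority-0 sources and one for priority-1 sources. Changing how many channels each priority owns is one plausible way to tune the relative service rates, though the abstract does not say how the hardware is configured.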