GNNSched: A GNN inference task scheduling framework on GPU

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (1): 1-11.

• High Performance Computing •

SUN Qing-xiao, LIU Yi, YANG Hai-long, WANG Yi-qing, JIA Jie, LUAN Zhong-zhi, QIAN De-pei

(School of Computer Science and Engineering, Beihang University, Beijing 100191, China)

Received: 2022-12-28 Revised: 2023-03-04 Online: 2024-01-25 Published: 2024-01-15

Abstract

Because of frequent memory accesses, graph neural networks (GNNs) often achieve low resource utilization when running on GPUs. Existing inference frameworks do not account for the irregularity of GNN inputs, so applying them directly to GNN inference tasks may exceed GPU memory capacity. For GNN inference, the memory footprint of concurrent tasks must be analyzed in advance from their input characteristics to ensure that the tasks can be co-located on the GPU. In addition, inference tasks submitted in multi-tenant scenarios need flexible scheduling strategies to meet the quality-of-service (QoS) requirements of concurrent inference tasks. To address these problems, this paper proposes GNNSched, which efficiently manages the co-location of GNN inference tasks on the GPU. Specifically, GNNSched organizes concurrent inference tasks into a queue and estimates the memory footprint of each task using an operator-level cost function. GNNSched implements multiple scheduling strategies to generate task groups, which are iteratively submitted to the GPU for concurrent execution. Experimental results show that GNNSched meets the QoS requirements of concurrent GNN inference tasks and reduces the response time of inference tasks.

Key words: graph neural network (GNN), graphics processing unit (GPU), inference framework, task scheduling, estimation model
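
The workflow summarized in the abstract (queue the tasks, estimate each task's memory footprint per operator, then form groups that fit on the GPU) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the cost function or the scheduling strategies, so `Operator`, `Task`, `estimate_operator_bytes`, and `group_tasks_fifo` are hypothetical names, the per-operator cost (a float32 feature matrix plus an int32 edge list) is a stand-in, and summing operator costs is a simple upper bound, not the paper's estimation model. The grouping shown is a plain first-in-first-out policy.

```python
from dataclasses import dataclass
from typing import List

BYTES_PER_FLOAT32 = 4
BYTES_PER_INT32 = 4


@dataclass
class Operator:
    """Hypothetical description of one GNN operator and its input size."""
    name: str
    num_nodes: int
    num_edges: int
    feat_dim: int


@dataclass
class Task:
    """Hypothetical inference task: an ID plus its operator sequence."""
    task_id: str
    operators: List[Operator]


def estimate_operator_bytes(op: Operator) -> int:
    # Assumed cost: node feature matrix (float32) + edge list (2 x int32 per edge).
    features = op.num_nodes * op.feat_dim * BYTES_PER_FLOAT32
    edges = op.num_edges * 2 * BYTES_PER_INT32
    return features + edges


def estimate_task_bytes(task: Task) -> int:
    # Assumed upper bound: every operator's buffers are alive at the same time.
    return sum(estimate_operator_bytes(op) for op in task.operators)


def group_tasks_fifo(queue: List[Task], capacity_bytes: int) -> List[List[Task]]:
    """Pack queued tasks, in arrival order, into groups that fit in GPU memory."""
    groups: List[List[Task]] = []
    current: List[Task] = []
    used = 0
    for task in queue:
        need = estimate_task_bytes(task)
        if need > capacity_bytes:
            raise ValueError(f"task {task.task_id} cannot fit on the GPU alone")
        # Close the current group when the next task would overflow it.
        if current and used + need > capacity_bytes:
            groups.append(current)
            current, used = [], 0
        current.append(task)
        used += need
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    ops = [Operator("aggregate", num_nodes=1000, num_edges=5000, feat_dim=16)]
    queue = [Task(f"t{i}", ops) for i in range(3)]
    groups = group_tasks_fifo(queue, capacity_bytes=250_000)
    print([[t.task_id for t in g] for g in groups])  # [['t0', 't1'], ['t2']]
```

Each resulting group would then be submitted to the GPU in turn. A QoS-oriented strategy could reorder the queue before grouping (for example, by estimated footprint or deadline); that reordering is likewise not specified in the abstract and is left out here.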

SUN Qing-xiao, LIU Yi, YANG Hai-long, WANG Yi-qing, JIA Jie, LUAN Zhong-zhi, QIAN De-pei. GNNSched: A GNN inference task scheduling framework on GPU[J]. Computer Engineering & Science, 2024, 46(1): 1-11.
