A quality-of-service (QoS)-aware dynamic scheduling method for deep learning tasks on GPU clusters is proposed. An offline evaluation module profiles deep learning tasks and builds a computational performance prediction model. Using this model together with each task's expected QoS, an online scheduling module decides task placement and task execution order. Experiments on a distributed GPU cluster demonstrate that the proposed method achieves a higher QoS-guarantee percentage and higher cluster resource utilization than the baseline schedulers.
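
To make the two-stage idea concrete, the following is a minimal illustrative sketch, not the method evaluated in this work. It assumes that the expected QoS of a task can be modeled as a completion deadline, that the offline performance model reduces to a per-task seconds-per-iteration table, and that the online scheduler uses a simple least-slack-first greedy policy. The names `Task`, `predict_runtime`, `schedule`, and `qos_guarantee_percentage` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    iterations: int   # number of training iterations requested
    deadline: float   # ASSUMPTION: expected QoS modeled as a deadline in seconds


def predict_runtime(task, secs_per_iter):
    """Stand-in for the offline-built performance prediction model.

    ASSUMPTION: offline profiling yields a seconds-per-iteration value
    for each task; a real model would likely be richer than this table.
    """
    return task.iterations * secs_per_iter[task.name]


def schedule(tasks, secs_per_iter, num_gpus):
    """Greedy online scheduling: least predicted slack first,
    each task placed on the GPU that becomes free earliest."""
    free_at = [0.0] * num_gpus  # time at which each GPU becomes idle
    plan = []
    # Execution order: tasks with the smallest slack go first.
    ordered = sorted(
        tasks, key=lambda t: t.deadline - predict_runtime(t, secs_per_iter)
    )
    for t in ordered:
        # Placement: pick the earliest-available GPU.
        gpu = min(range(num_gpus), key=lambda g: free_at[g])
        start = free_at[gpu]
        finish = start + predict_runtime(t, secs_per_iter)
        free_at[gpu] = finish
        plan.append((t.name, gpu, start, finish, finish <= t.deadline))
    return plan


def qos_guarantee_percentage(plan):
    """Share of scheduled tasks whose predicted finish meets the deadline."""
    if not plan:
        return 0.0
    return 100.0 * sum(1 for entry in plan if entry[4]) / len(plan)


if __name__ == "__main__":
    # Made-up profiling values and tasks, for illustration only.
    secs_per_iter = {"A": 0.5, "B": 0.5, "C": 0.25}
    tasks = [Task("A", 20, 15.0), Task("B", 40, 22.0), Task("C", 40, 40.0)]
    for entry in schedule(tasks, secs_per_iter, num_gpus=2):
        print(entry)
```

In this sketch the prediction step and the scheduling step are deliberately decoupled, so a more accurate performance model or a different ordering policy could be substituted without changing the other part.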