• Journal of the China Computer Federation
• Chinese Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science


A MapReduce workflow heterogeneous scheduling
algorithm based on two-level DAG model
 

WANG Yu-xin, WANG Fei, WANG Guan, GUO He

  1. (School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China)
     
  • Received: 2018-12-01 Revised: 2019-02-11 Online: 2019-08-25 Published: 2019-08-25

Abstract:

The MapReduce programming model is widely used in big data processing platforms, and an effective task scheduling algorithm is critical to its efficiency. In our approach, a MapReduce workflow is decomposed into a number of jobs with precedence constraints; each job has a Map phase and a Reduce phase, and each phase contains multiple tasks. Taking into account the available resources of the computing cluster and the heterogeneity of its tasks, we construct a two-level directed acyclic graph (DAG) model of jobs and tasks, and propose a MapReduce workflow heterogeneous scheduling algorithm based on two-level priority ordering (2-MRHS). In the first stage of the algorithm, priority ordering is performed: priority weights are calculated at the job level and at the task level to form the task scheduling queue. In the task assignment stage, the data block subtasks of each task are assigned to the appropriate computing node according to the task's earliest finish time (EFT). Experiments on a large number of randomly generated DAG models show that our algorithm achieves a shorter scheduling length (makespan) and better stability than the other algorithms compared.
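
The two stages described above can be illustrated with a minimal sketch. It is not the 2-MRHS algorithm itself: the abstract does not say how the priority weights are computed, so this sketch assumes a HEFT-style upward rank at the job level and a largest-work-first order at the task level, and it treats each task as a single unit instead of splitting it into data block subtasks. The names `upward_rank`, `schedule`, `jobs`, `job_succ` and `speeds` are hypothetical.

```python
def upward_rank(succ, cost):
    """Assumed job-level priority: own cost plus the longest path to an exit job."""
    memo = {}

    def rank(n):
        if n not in memo:
            memo[n] = cost[n] + max((rank(s) for s in succ.get(n, ())), default=0.0)
        return memo[n]

    return {n: rank(n) for n in cost}


def schedule(jobs, job_succ, speeds):
    """Order jobs by priority, then place each task on the node with the smallest EFT.

    jobs:     {job_id: {"map": [work, ...], "reduce": [work, ...]}}, all work > 0
    job_succ: {job_id: [successor job ids]} (the job-level DAG)
    speeds:   processing speed of each heterogeneous node
    """
    mean_speed = sum(speeds) / len(speeds)
    job_cost = {j: (sum(p["map"]) + sum(p["reduce"])) / mean_speed
                for j, p in jobs.items()}
    ranks = upward_rank(job_succ, job_cost)
    # With positive costs, descending rank is also a valid topological order.
    job_order = sorted(jobs, key=lambda j: -ranks[j])

    preds = {j: [] for j in jobs}
    for j, successors in job_succ.items():
        for s in successors:
            preds[s].append(j)

    node_ready = [0.0] * len(speeds)   # time at which each node becomes free
    job_finish = {}
    plan = []                          # (job, phase, task index, node, finish time)
    for j in job_order:
        # A job may start only after all predecessor jobs have finished.
        ready = max((job_finish[p] for p in preds[j]), default=0.0)
        for phase in ("map", "reduce"):    # task level: Map precedes Reduce
            phase_finish = ready
            # Assumed task-level priority: largest work first.
            tasks = sorted(enumerate(jobs[j][phase]), key=lambda t: -t[1])
            for idx, work in tasks:
                # EFT on node n = max(node free time, task ready time) + run time
                eft, node = min((max(node_ready[n], ready) + work / speeds[n], n)
                                for n in range(len(speeds)))
                node_ready[node] = eft
                phase_finish = max(phase_finish, eft)
                plan.append((j, phase, idx, node, eft))
            ready = phase_finish
        job_finish[j] = ready
    return plan, max(job_finish.values())


if __name__ == "__main__":
    jobs = {"A": {"map": [4, 4], "reduce": [2]},
            "B": {"map": [2], "reduce": [2]}}
    plan, makespan = schedule(jobs, {"A": ["B"]}, speeds=[1.0, 2.0])
    for entry in plan:
        print(entry)
    print("makespan:", makespan)
```

The makespan returned is the finish time of the last job, which is the quantity the experiments in the abstract compare across algorithms.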
 

Key words: MapReduce, workflow, heterogeneous computing, task scheduling