• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

J4 ›› 2014, Vol. 36 ›› Issue (12): 2286-2295.

• 论文 • 上一篇    下一篇

基于排队网络的多优先级MapReduce作业调度算法

万聪,王翠荣,王聪,吕艳霞,贾朔   

  1. (东北大学信息科学与工程学院,辽宁 沈阳 110819)
  • 收稿日期:2014-08-29 修回日期:2014-11-06 出版日期:2014-12-25 发布日期:2014-12-25
  • 基金资助:

    国家自然科学基金资助项目(61300195);辽宁省教育厅科学研究一般资助项目(L2013099);中央高校基本科研业务费资助项目(N110323009)

A MapReduce scheduling algorithm supporting
multiple priorities based on queuing network          

WAN Cong,WANG Cuirong,WANG Cong,L Yanxia,JIA Shuo   

  1. (College of Information Science and Engineering,Northeastern University,Shenyang 110819,China)
  • Received:2014-08-29 Revised:2014-11-06 Online:2014-12-25 Published:2014-12-25

摘要:

MapReduce是一个能够对大规模数据进行分布式处理的框架,目前被各个领域广泛应用。在提供MapReduce服务的集群中,如何保证不同优先级用户的截止时间限定是MapReduce作业调度问题的一个挑战。针对这一问题,提出了一个基于排队网络的多优先级作业调度算法(MPSA)。首先分析和归纳了基于MapReduce模型的算法,提出了三种常见模式,采用Jackson排队网络对基于MapReduce模型的算法建立了数学模型,应用该网络模型可以求出不同优先级队列对资源的需求;随后使用AR(1)模型进行预测,使算法可以动态地适应不同的用户访问量;利用二分查找算法,分步计算出不同优先级在map阶段和reduce阶段分配的槽位数;最后实现了在MapReduce模型中应用的实时调度算法。实验结果表明,与传统的FIFO和公平调度算法相比,本文提出的算法在用户到达率和任务规模变化的情况下,可以更加有效地满足不同优先级用户的截止时间限定。

关键词: 云计算, 排队网络, MapReduce, 调度算法

Abstract:

MapReduce is a distributed computing framework for big data processing, which has been widely used in various fields. It’s a challenge to ensure the deadline of different priority users in the cluster providing MapReduce services. To solve this problem, a queuing network based multipriority scheduling algorithm (MPSA) is proposed. Firstly, the MapReduce based algorithms are summarized and analyzed, three common patterns are proposed, and the Jackson queuing network is used to build a mathematic model of the MapReduce based algorithms. The mathematic model can be used to find the resource demands of different priority queues. Secondly, the AR(1) model is used to predict the numbers of accessing users, and the binary search algorithm is used to calculate the assigned slot numbers of different priority users in map phase and reduce phase. Finally, a real time scheduling algorithm running in the MapReduce framework is implemented. Experimental results show that, compared with the traditional FIFO and fair scheduling algorithm, the proposed scheduling algorithm can ensure the  defined deadlines of different priority users more effectively when the user arrival rates and the task scales change.

Key words: cloud computing;queuing network;MapReduce;scheduling algorithm