• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (8): 1-7.

• Article •

Data placement and task scheduling based on associated amount in a cloud computing environment

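GUO Lizheng1,2, ZHAO Shuguang1, JIANG Changyuan1

  1. (1. College of Information Science and Technology, Donghua University, Shanghai 201620; 2. Department of Computer Science and Engineering, Henan University of Urban Construction, Pingdingshan, Henan 467036)
  • Received: 2012-10-29  Revised: 2013-01-07  Published: 2013-08-25  Released online: 2013-08-25
  • Funding:

    National Natural Science Foundation of China (70971020); Key Science and Technology Research Project of the Education Department of Henan Province (12A520006)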

Data placement and task scheduling based on associated amount in cloud computing

GUO Lizheng1,2,ZHAO Shuguang1,JIANG Changyuan1   

  1. (1.College of Information Science and Technology,Donghua University,Shanghai 201620;
    2.Department of Computer Science and Engineering,Henan University of Urban Construction,Pingdingshan 467036,China)
  • Received:2012-10-29 Revised:2013-01-07 Online:2013-08-25 Published:2013-08-25

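Abstract:

Scientific workflows address complex problems and rely on cluster or grid platforms; the emergence of cloud computing gives them another platform to choose from. In a cloud computing environment, the scientific workflows of data-intensive applications process and transfer huge volumes of data, so reducing the number and volume of data transfers between different clusters in a data center is a challenging problem. The data that a scientific workflow processes are interdependent. A relational matrix is built from the maximum associated amount of the dependencies between data; the bond energy algorithm then clusters this matrix so that the most closely related data are grouped together; finally, a K-partition method splits the clustered matrix into k parts, each of which is deployed to the corresponding cluster in the data center. Simulation results show that the method effectively reduces the number and volume of data movements between different clusters in a data center.

Keywords: cloud computing, associated amount, bond energy algorithm, data placement, task scheduling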
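
The relational matrix mentioned in the abstract can be pictured as a symmetric table indexed by datasets. The sketch below is illustrative only. The abstract does not give the exact definition of the associated amount, so the code assumes, hypothetically, that every task using two datasets together contributes their combined size. The names `association_matrix`, `task_inputs`, and `sizes`, and the sample values, are invented for this example.

```python
from itertools import combinations_with_replacement


def association_matrix(task_inputs, sizes):
    """Build a symmetric matrix of a hypothetical 'associated amount'.

    task_inputs maps a task name to the datasets it reads.
    sizes maps a dataset name to its size (arbitrary units).
    Assumption (not from the paper): each task that uses datasets a and b
    together adds sizes[a] + sizes[b] to the entry for (a, b).
    """
    names = sorted(sizes)
    index = {name: pos for pos, name in enumerate(names)}
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for inputs in task_inputs.values():
        # Visit every unordered pair of datasets used by this task,
        # including a dataset paired with itself (the diagonal).
        for a, b in combinations_with_replacement(sorted(set(inputs)), 2):
            i, j = index[a], index[b]
            weight = sizes[a] + sizes[b]
            matrix[i][j] += weight
            if i != j:
                matrix[j][i] += weight  # keep the matrix symmetric
    return names, matrix


# Made-up workflow: three tasks over three datasets.
task_inputs = {"t1": ["d1", "d2"], "t2": ["d2", "d3"], "t3": ["d1", "d2"]}
sizes = {"d1": 10, "d2": 20, "d3": 5}
names, A = association_matrix(task_inputs, sizes)
for name, row in zip(names, A):
    print(name, row)
```

In this toy input, `d1` and `d2` are used together by two tasks and so receive the largest off-diagonal entry, while `d1` and `d3` never appear together and stay at zero. Any other weighting rule could be substituted without changing the shape of the matrix.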

Abstract:

Scientific workflows tackle complex problems and have traditionally relied on cluster or grid platforms; the emergence of cloud computing offers an alternative. Data-intensive scientific workflows in the cloud process and transfer huge volumes of data, so reducing the number and volume of data transfers between different clusters in a data center is a challenging problem. The data processed by a scientific workflow are interdependent. First, a relational matrix is built from the associated amount of these dependencies. Second, the relational matrix is clustered using the bond energy algorithm. Third, the clustered matrix is partitioned into k parts, and each part is deployed to the corresponding cluster in the data center. Simulation results show that the proposed method effectively reduces the number and volume of data transfers between clusters in a data center.

Key words: cloud computing; associated amount; bond energy algorithm; data placement; task scheduling
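
The clustering and partitioning steps of the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `bea_order` follows the textbook greedy-insertion form of the bond energy algorithm, while `split_into_k` is a swapped-in simplification that cuts the ordering at the k-1 weakest adjacent links, since the abstract does not detail the partitioning rule. All function names and the sample matrix are hypothetical.

```python
def bond(matrix, x, y):
    """Bond between columns x and y: the inner product of the two columns."""
    return sum(row[x] * row[y] for row in matrix)


def bea_order(matrix):
    """Order the columns of a square matrix by greedy bond-energy insertion."""
    n = len(matrix)
    if n <= 2:
        return list(range(n))
    order = [0, 1]  # the first two columns are placed arbitrarily
    for k in range(2, n):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            # Contribution of placing k between left and right:
            # 2*bond(left, k) + 2*bond(k, right) - 2*bond(left, right),
            # where a bond with a missing neighbour counts as zero.
            gain = 0
            if left is not None:
                gain += 2 * bond(matrix, left, k)
            if right is not None:
                gain += 2 * bond(matrix, k, right)
            if left is not None and right is not None:
                gain -= 2 * bond(matrix, left, right)
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    return order


def split_into_k(matrix, order, k):
    """Cut the ordering at the k-1 weakest adjacent links (a simplification)."""
    links = sorted(range(len(order) - 1),
                   key=lambda i: matrix[order[i]][order[i + 1]])
    cuts = sorted(links[:k - 1])
    parts, start = [], 0
    for cut in cuts:
        parts.append(order[start:cut + 1])
        start = cut + 1
    parts.append(order[start:])
    return parts


# Made-up symmetric matrix: datasets 0 and 2 are strongly associated,
# as are datasets 1 and 3.
A = [
    [9, 0, 8, 1],
    [0, 9, 1, 8],
    [8, 1, 9, 0],
    [1, 8, 0, 9],
]
order = bea_order(A)
parts = split_into_k(A, order, 2)
print(order)  # [0, 2, 1, 3]
print(parts)  # [[0, 2], [1, 3]]
```

Each resulting part would then be assigned to one cluster, so that datasets with a large associated amount end up on the same cluster and transfers between clusters are less frequent.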