• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Workload characterization and task scheduling
optimization of co-located Internet data centers

WANG Ji-wei1, GE Zhe-feng1, JIANG Cong-feng1, ZHANG Ji-lin1, YU Jun1, LIN Jiang-bin2, YAN Long-chuan3, REN Zu-jie4, WAN Jian5

  (1. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018;
    2. Alibaba Cloud Computing Co., Ltd., Hangzhou 311121;
    3. State Grid Electrical Information Communication Co., Ltd., Beijing 100053;
    4. Zhijiang Laboratory, Hangzhou 311121;
    5. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China)
  • Received: 2019-08-18 Revised: 2019-10-21 Online: 2020-01-25 Published: 2020-01-25

Abstract:

Modern Internet data centers (IDCs) face growing challenges in energy consumption, reliability, manageability, and scalability as they increase in size. IDCs today host a variety of services, including online web services and offline batch processing jobs. Online jobs require low latency, while offline jobs require high throughput. To improve server utilization and reduce energy consumption, IDCs often deploy online and offline jobs on the same computing cluster. In this co-located scenario, the key challenge is meeting the different requirements of online and offline jobs at the same time. This paper analyzes the Alibaba co-located cluster trace (cluster-trace-v2018), which covers 4034 machines over 8 days. The co-located workloads are characterized in terms of static configuration, dynamic co-located run-time status, and the directed acyclic graph (DAG) dependency structure of offline batch jobs, including the relationship between task skew and container distribution. Based on task dependencies and critical paths, a corresponding task scheduling optimization strategy is proposed.
 

Key words: co-located data center, workload characterization, online service, batch job, scheduling