• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A data flow programming model and
compiler optimization for Storm
 

YANG Qiuji,YU Junqing,MO Binsheng,HE Yunfeng   

  1. (Center of Network and Computation, Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received: 2016-08-15 Revised: 2016-10-09 Online: 2016-12-25 Published: 2016-12-25

Abstract:

As a domain-specific programming model, data flow programming combines the features of media applications and programming languages and offers an attractive way to express parallelism. However, the hierarchical storage structure of multicore cluster architectures poses new challenges to the performance of data flow applications, and programmability remains a significant challenge for the compiler. To address the problems that the data flow programming model faces when processing big data in the digital media field, we design and implement an integration of a data flow programming model with a distributed computing framework, and propose a compiler optimization framework for Storm based on COStream. The compiler optimization method for Storm consists of two steps: hierarchical task partitioning and scheduling, followed by pipeline scheduling and code generation. Hierarchical task partitioning and scheduling assigns tasks to the multicore nodes of the cluster, balancing the workload across cores while keeping inter-cluster communication overhead small. Pipeline scheduling and code generation builds software pipelines between cluster nodes and between the cores within a node, and generates the corresponding object code. We use a multicore cluster as the target platform, deploy the Storm distributed architecture on it, and choose typical digital media processing programs as benchmarks to evaluate and analyze the optimization performance for Storm. Experimental results verify the effectiveness of the proposed model.

Key words: multi-core cluster, data flow programming, compiler, pipeline, COStream