• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (10): 1846-1853.

• Articles •

Dynamic data distribution for stream processing system       

WANG Chengzhang1,LIN Xuelian1,TAN Jingfang2   

  (1. School of Computer Science and Engineering, Beihang University, Beijing 100191;
    2. School of Physics and Electronic Engineering, Taishan University, Taian 271021, China)
  • Received: 2014-06-11 Revised: 2014-08-24 Online: 2014-10-25 Published: 2014-10-25

Abstract:

In stream processing systems, data skew often leads to load imbalance among computing nodes and thereby increases the response time of data processing. Traditional load balancing methods such as operator distribution, operator migration, and load shedding have not been widely applied in stream processing systems because they incur a relatively high performance penalty. Considering the characteristics of stream processing systems, we propose a new load balancing mechanism. In this mechanism, the data on each computing unit is split into sections, and each section can be allocated and migrated dynamically among computing units. To balance load, the input streams and utilization of the computing units are then evened out by reassigning sections, with little disturbance to the stream processing system. Based on this mechanism, we design and implement a load balancing algorithm and an online data migration method. Experimental results show that our mechanism reduces the average latency of data processing and significantly improves system throughput.

Key words: data stream; stream processing; load balancing; data distribution; data migration