• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (10): 110-115.

• Articles •

Efficient Topk query algorithm on massdistributed data      

WEI Xianquan,ZHENG Hongyuan,DING Qiulin   

  1. (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
  • Received: 2013-05-20 Revised: 2013-08-20 Online: 2013-10-25 Published: 2013-10-25

Abstract:

To address the shortcomings of existing distributed Top-k query algorithms, a novel Top-k algorithm, named ECHT, is proposed for massive distributed data. Taking the data distribution into account, ECHT introduces a new error-limited histogram algorithm. On the one hand, this avoids poor performance on unevenly distributed data. On the other hand, it improves the accuracy of the threshold value, which further reduces network bandwidth consumption. In addition, ECHT performs early clipping: clipping the data before any large-scale transmission improves performance by avoiding the transfer of large amounts of useless data. Experiments on real datasets demonstrate the viability and superior performance of the new algorithm.
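
The general idea of a histogram-derived threshold combined with early clipping can be illustrated with a minimal Python sketch. This is not the ECHT algorithm: the abstract does not specify the error-limited histogram, so a plain equal-width histogram is swapped in. The function names, the assumption that every node holds a disjoint list of numeric scores within a known range [lo, hi], and the choice of 10 buckets are all hypothetical and serve only as illustration.

```python
import heapq


def bucket_of(score, lo, hi, buckets):
    """Map a score to an equal-width bucket index over the shared range [lo, hi]."""
    width = (hi - lo) / buckets
    idx = int((score - lo) / width)
    return max(0, min(idx, buckets - 1))  # clamp the range endpoints


def local_histogram(scores, lo, hi, buckets):
    """Summary each node sends first: bucket counts only, not the data itself."""
    counts = [0] * buckets
    for s in scores:
        counts[bucket_of(s, lo, hi, buckets)] += 1
    return counts


def cut_bucket(histograms, k):
    """Coordinator side: highest bucket such that it and all buckets above it
    together hold at least k scores across every node."""
    buckets = len(histograms[0])
    seen = 0
    for i in range(buckets - 1, -1, -1):
        seen += sum(h[i] for h in histograms)
        if seen >= k:
            return i
    return 0  # fewer than k scores overall: nothing can be clipped


def early_clip(scores, cut, lo, hi, buckets):
    """Node side: drop every score that falls below the cut bucket before sending."""
    return [s for s in scores if bucket_of(s, lo, hi, buckets) >= cut]


def distributed_top_k(nodes, lo, hi, k, buckets=10):
    """Return the global top-k scores and the number of scores actually transmitted."""
    histograms = [local_histogram(n, lo, hi, buckets) for n in nodes]
    cut = cut_bucket(histograms, k)
    candidates = [s for n in nodes for s in early_clip(n, cut, lo, hi, buckets)]
    return heapq.nlargest(k, candidates), len(candidates)


if __name__ == "__main__":
    nodes = [[5, 91, 42, 77], [88, 13, 95, 60], [30, 99, 71, 8]]
    top, sent = distributed_top_k(nodes, lo=0, hi=100, k=3)
    print(top, sent)
```

In this sketch the result is exact because at least k scores lie at or above the cut bucket, so no true top-k score is clipped. A finer or distribution-aware histogram yields a tighter cut and fewer transmitted candidates, which is the effect the abstract attributes to the more accurate threshold value.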

Key words: massive data; Top-k; early clipping; new error-limited histogram