• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (09): 1567-1573.

• Computer Network and Information Security •

Optimization of cold and hot flows replacement in large-scale data flow statistics

QIAO Guan-jie, LÜ Gao-feng, TAN Jing, MO Lu-sha

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2020-10-11 Revised: 2021-01-13 Accepted: 2021-09-25 Online: 2021-09-25 Published: 2021-09-27
  • Supported by:
    National Key Research and Development Program of China (2018YFB1800505)

Abstract: This paper studies large-scale data flow statistics and optimizes the replacement strategy of Elastic Sketch, a typical structure for heavy-flow statistics. The optimized strategy resolves the problem of cold flows being misjudged as hot flows and inserted into the heavy part. To address the problem that the flows stored in the heavy part are not necessarily the largest ones, a replacement strategy based on the maximum value and set association is proposed. It ensures that the flows stored in the heavy part are the largest flows, improves the accuracy of heavy-flow statistics, and greatly reduces the probability of hot collisions. Compared with traditional measurement methods, it improves measurement accuracy while reducing memory usage.

Key words: large-scale data flow statistics, sketch, cold and hot flow separation
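
The replacement idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SetAssociativeSketch`, the parameters `num_sets`, `ways`, and `light_size`, the single-array light part, and the exact promotion rule (a flow is promoted once its light-part counter exceeds the smallest count in its group) are all assumptions made for illustration.

```python
import hashlib


class SetAssociativeSketch:
    """Illustrative heavy/light structure (hypothetical, not the paper's algorithm)."""

    def __init__(self, num_sets=4, ways=2, light_size=64):
        # Heavy part: num_sets groups, each holding at most `ways` flow -> count entries.
        self.heavy = [dict() for _ in range(num_sets)]
        self.ways = ways
        # Light part: one array of counters shared by all cold flows.
        self.light = [0] * light_size

    def _hash(self, key, salt, mod):
        # Deterministic hash so results are reproducible across runs.
        digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % mod

    def _light_index(self, flow):
        return self._hash(flow, "light", len(self.light))

    def insert(self, flow):
        group = self.heavy[self._hash(flow, "heavy", len(self.heavy))]
        if flow in group:             # already tracked as hot
            group[flow] += 1
            return
        if len(group) < self.ways:    # free slot in the group
            group[flow] = 1
            return
        # Group is full: count the packet in the light part first, so a
        # single packet of a cold flow cannot displace a hot flow.
        idx = self._light_index(flow)
        self.light[idx] += 1
        victim = min(group, key=group.get)   # smallest flow in the group
        if self.light[idx] > group[victim]:
            # Assumed rule: promote only when the newcomer's estimate
            # exceeds the smallest tracked count; demote the victim.
            promoted = self.light[idx]
            self.light[idx] = 0
            self.light[self._light_index(victim)] += group.pop(victim)
            group[flow] = promoted

    def query(self, flow):
        group = self.heavy[self._hash(flow, "heavy", len(self.heavy))]
        if flow in group:
            return group[flow]
        return self.light[self._light_index(flow)]
```

Under these assumptions, a flow in a full group displaces only the smallest entry of that group, and only after it has accumulated a larger count in the light part, so the larger entries of each group are never the ones evicted.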