• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2010, Vol. 32 ›› Issue (8): 11-13. doi: 10.3969/j.issn.1007130X.2010.

• Articles •

Flow Size Distribution Estimation of the Subpopulations from Random Packet Sampling

ZHANG Hai1,2, ZHU Xuyang2, GUO Wenming2

  (1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641, Guangdong, China;
   2. Network Center, Southern Medical University, Guangzhou 510515, Guangdong, China)
  • Received: 2009-07-06 Revised: 2009-10-10 Online: 2010-07-25 Published: 2010-07-25
  • Contact: ZHANG Hai
  • About the authors: ZHANG Hai (born 1972), male, from Shanghai, associate professor, research interests: network management and network measurement; ZHU Xuyang, senior engineer, research interests: network management and software engineering; GUO Wenming, professor, research interests: network management and network security.

Abstract:

Random packet sampling is the most common sampling method in network management and measurement. Previous work has mostly focused on estimating the flow size distribution of the complete population of flows from randomly sampled packet data. However, a number of network applications are more concerned with the flow size distribution of a particular subpopulation of the total traffic. In this paper, we divide the complete population of flows into two subsets: a subpopulation S and its complement. We propose an algorithm that uses TCP protocol information in the randomly sampled data to estimate the joint flow size distribution of S and its complement. Experiments on real network traces show that the proposed method restores the characteristics of the flow size distribution of the subpopulation, both on its own and within the complete population of flows. Using the TCP protocol information in the sampled flows also improves the accuracy of the subpopulation flow size distribution estimate.

Key words: packet sampling; flow size distribution; Internet measurement
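
The sampling setting described in the abstract can be illustrated with a short simulation. The sketch below is not the authors' estimation algorithm; it only shows, under assumed toy data, how random packet sampling distorts the observed flow size distribution of a subpopulation, which is the distortion any estimator has to undo. The function names, the sampling probability `p = 0.1`, and the flow sizes in `flows_s` are all hypothetical.

```python
import random
from collections import Counter


def sample_flows(flow_sizes, p, rng):
    """Sample every packet independently with probability p.

    Returns the sampled size of each flow that kept at least one packet.
    Flows that lose all their packets disappear from the sample.
    """
    sampled = []
    for size in flow_sizes:
        kept = sum(1 for _ in range(size) if rng.random() < p)
        if kept > 0:
            sampled.append(kept)
    return sampled


def size_distribution(sizes):
    """Empirical flow size distribution: size -> fraction of flows."""
    counts = Counter(sizes)
    total = len(sizes)
    return {size: n / total for size, n in sorted(counts.items())}


rng = random.Random(0)

# Hypothetical subpopulation S: many small flows and a few large ones.
flows_s = [2] * 1000 + [50] * 100

sampled_s = sample_flows(flows_s, p=0.1, rng=rng)

original = size_distribution(flows_s)    # distribution before sampling
observed = size_distribution(sampled_s)  # distribution seen in the sample
```

Comparing `original` with `observed` shows two effects: most small flows vanish from the sample entirely, and the surviving flows appear much shorter than they were. Under this sampling model a flow of k packets contributes a binomially distributed number of sampled packets, so the observed distribution cannot be read as the original one without an inversion step.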