• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Computer Networks and Information Security

Machine-learning-based traffic identification techniques: a review and outlook

赵双 (ZHAO Shuang), 陈曙晖 (CHEN Shuhui)

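  1. (College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China)
  • Received: 2017-10-24  Revised: 2018-01-14  Online: 2018-10-25  Published: 2018-10-25
  • Funding:

    National Natural Science Foundation of China (61379148)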

Review: Traffic identification based on machine learning

ZHAO Shuang, CHEN Shuhui

  1. (College of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2017-10-24  Revised: 2018-01-14  Online: 2018-10-25  Published: 2018-10-25

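摘要 (Abstract):

Traffic identification is a key step in network management and network security. As port-number-based identification and deep packet inspection have successively lost their effectiveness, machine-learning-based traffic identification has become the most closely watched approach in the field over the past decade. Given the importance of traffic identification, this paper first outlines traffic identification techniques and the related basic concepts, including application scenarios, input objects, identification types, and evaluation metrics. It then details progress, in the machine learning setting, on the key techniques of the identification process (dataset acquisition, feature extraction and selection, and identification model design) and summarizes and compares the main research of recent years. Finally, it discusses the main challenges facing machine-learning-based traffic identification and its future directions.

关键词 (Keywords): traffic identification, machine learning, network measurement, traffic dataset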

Abstract:

Traffic identification is an essential step in network management and security. As port-number-based techniques and deep packet inspection become less effective, machine-learning-based traffic identification has attracted particular attention over the past decade. Given the importance of traffic identification, we first give a brief overview of traffic identification techniques and the basic concepts involved, including application scenarios, input objects, identification types, and evaluation metrics. Then, in the context of machine learning, we detail the development of key techniques such as dataset acquisition, feature extraction and selection, and identification model design. We also summarize and compare recent mainstream studies. Finally, we discuss the major challenges and prospects of machine-learning-based traffic identification.
 

Key words: traffic identification, machine learning, network measurement, traffic data set