• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (7): 24-28.

• Articles •

A Modified Approach for Byte Frequency based Payload Anomaly Intrusion Detection

翁广安1,余胜生2,周敬利2   

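  1. (1. Department of Computer Science, Wenhua School, Huazhong University of Science and Technology, Wuhan 430074, Hubei; 2. School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China)
  • Received: 2011-08-25 Revised: 2011-11-01 Online: 2012-07-25 Published: 2012-07-25
  • Supported by:

    University-level Natural Science Foundation project (j02005302)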

A Modified Approach for Byte Frequency based Payload Anomaly Intrusion Detection

WENG Guangan1,YU Shengsheng2,ZHOU Jingli2   

  1. (1. Department of Computer Science, Wenhua School,
    Huazhong University of Science and Technology, Wuhan 430074;
    2. School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received:2011-08-25 Revised:2011-11-01 Online:2012-07-25 Published:2012-07-25

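Abstract:

The content characteristics of a dataset have a large effect on the accuracy of payload-based network anomaly intrusion detection systems. This paper analyzes how differences in content characteristics among the packets of a training set affect models based on byte frequency distribution: large differences can cause models that compute mean frequencies over groups of packets to produce a high false alarm rate. On this basis, an improved model, the single packet frequency distribution model, is proposed. It builds the normal behavior set from the frequency distribution features of individual packets and controls the size of that set by clustering. Experiments on a simulated dataset and the DARPA99 dataset show that differences in the content characteristics of training packets do cause the mean-based byte frequency model to produce more false alarms, whereas the single packet frequency distribution model is unaffected; it has higher detection accuracy, with a lower false alarm rate at the same detection rate. When the packets are completely different from one another, the mean-based model even fails. The single packet frequency distribution model can thus be expected to adapt well to network services with rich dynamic content.

Keywords: network intrusion detection system, byte frequency distribution, payload anomaly detection, simulated dataset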

Abstract:

The content characteristics of a dataset strongly affect the detection accuracy of payload-based network anomaly intrusion detection systems. This paper analyzes how differences in content characteristics among training packets influence models based on byte frequency distribution, and shows that large differences can lead models that average the frequencies of grouped packets to a higher false alarm rate. On this basis, a modified model named single packet frequency distribution is proposed. It forms the normal profiles from the frequency distribution of each individual packet rather than from averaged values, and controls the size of the normal set with clustering techniques. Experiments are carried out on a simulation dataset and on the DARPA99 real network dataset. The results indicate that large differences between packet contents do cause models based on average byte frequency values to generate more false alarms, whereas the single packet frequency distribution model is not affected: it achieves higher detection accuracy, with a lower false alarm rate at an equal detection rate. In the worst case, where the packets are completely different from one another, the average-value-based model even becomes invalid. The single packet frequency distribution model can therefore be expected to adapt well to network services with rich dynamic content.

Key words: NIDS; byte frequency distribution; payload anomaly detection; simulation dataset
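
The contrast drawn in the abstract between an average-based byte frequency model and a single packet frequency distribution model can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the distance measure or the clustering method, so the L1 (Manhattan) distance, the simple leader-style clustering, the `radius` parameter, and all function names below are assumptions made only for illustration.

```python
from collections import Counter


def byte_freq(payload):
    """Relative frequency of each of the 256 byte values in one payload."""
    counts = Counter(payload)
    total = len(payload) or 1  # avoid division by zero on empty payloads
    return [counts.get(b, 0) / total for b in range(256)]


def distance(p, q):
    """L1 distance between two frequency vectors (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_profile(payloads):
    """Average-based model: one mean frequency vector for the whole group."""
    vectors = [byte_freq(p) for p in payloads]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def single_packet_profiles(payloads, radius):
    """Single-packet model: keep per-packet vectors as the normal set.

    Assumed leader-style clustering: a packet's vector is added only if
    no vector already kept lies within `radius`, which bounds the set size.
    """
    profiles = []
    for payload in payloads:
        vector = byte_freq(payload)
        if all(distance(vector, kept) > radius for kept in profiles):
            profiles.append(vector)
    return profiles


def score_mean(payload, profile):
    """Anomaly score against the mean profile (higher = more anomalous)."""
    return distance(byte_freq(payload), profile)


def score_single(payload, profiles):
    """Anomaly score = distance to the nearest kept single-packet profile."""
    vector = byte_freq(payload)
    return min(distance(vector, kept) for kept in profiles)


# Two training packets with completely different contents.
training = [b"aaaa", b"bbbb"]
avg = mean_profile(training)
normal_set = single_packet_profiles(training, radius=0.5)
```

With this toy training set, the mean profile matches neither training packet, so a normal packet such as `b"aaaa"` receives a nonzero score from `score_mean` while a mixed packet such as `b"abab"` scores zero. `score_single` instead gives `b"aaaa"` a score of zero and an unseen packet such as `b"cccc"` a large score, which mirrors the failure mode the abstract attributes to average-based models.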