• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (07): 1207-1215.

• Computer Network and Information Security •

A network traffic classification method based on clustering and noise

PANG Xing-long,ZHU Guo-sheng,YANG Shao-long,LI Xiu-yuan   

  1. (School of Computer and Information Engineering,Hubei University,Wuhan 430062,China)
  • Received:2021-10-12 Revised:2021-12-14 Accepted:2022-07-25 Online:2022-07-25 Published:2022-07-25

Abstract: Because labeling real network traffic data inevitably introduces errors, the label data are contaminated by noise; that is, a sample's observed label differs from its true label. To reduce the negative impact of noisy labels on classifier accuracy, this work considers two kinds of labeling error: a sample assigned a valid but incorrect label type, and a label type that is misspelled. A network traffic classification method based on label noise correction is proposed. The method uses clustering and weight division to evaluate and repair the observed samples, and experiments are carried out on two network traffic datasets. The experimental results show that, compared with the three label noise repair algorithms STC, CC and ADE, the proposed repair algorithm improves the final classification results under different noise proportions. On the NSL-KDD dataset, the average label correction rate is higher by 23.00%, 7.58% and 2.05%, respectively; on the MOORE dataset, it is higher by 35.12%, 10.40% and 4.71%, respectively. The final classification model trained with the proposed method also shows good classification stability.

Key words: noisy label, network traffic classification, K-means clustering, label repair
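
The abstract names clustering as the basis for evaluating and repairing noisy labels but does not give the algorithm. The following is a minimal sketch of one common way such a repair step can work, not the authors' method: it clusters samples with plain K-means and relabels each cluster to its majority label when that label is sufficiently dominant. The function names `kmeans` and `repair_labels`, the `min_share` threshold, and the toy data are all assumptions made for illustration; the paper's weight division step is not modeled here.

```python
from collections import Counter
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means. The first k points seed the centroids,
    which keeps this sketch deterministic (an illustrative choice)."""
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assign


def repair_labels(points, labels, k, min_share=0.6):
    """Hypothetical repair rule: within each cluster, overwrite every
    label with the majority label if it covers at least min_share of
    the cluster; otherwise leave the cluster's labels untouched."""
    assign = kmeans(points, k)
    repaired = list(labels)
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        top, count = Counter(labels[i] for i in idx).most_common(1)[0]
        if count / len(idx) >= min_share:
            for i in idx:
                repaired[i] = top
    return repaired


# Toy data (made up): two well-separated groups of flow features.
demo_points = [
    (0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9),
    (0.1, 0.3), (4.9, 5.0), (0.3, 0.2), (5.1, 5.2),
]
# Index 4 carries a valid but wrong class; index 7 carries a misspelled class.
demo_labels = ["http", "dns", "http", "dns", "dns", "dns", "http", "dsn"]

print(repair_labels(demo_points, demo_labels, k=2))
```

On this toy input both kinds of error described in the abstract are overwritten by the cluster majority. A majority vote like this only helps when clusters align well with the true classes, which is one reason a real method would need a further evaluation step before repairing.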