• Journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (02): 256-264.

• Computer Network and Information Security •

Research on intrusion detection method based on SAE and WGAN

LIU Yongmin1,2, XU Cheng1,2, HUANG Hao1,2, ZHANG Qianlei1,2, ZHAO Junjie1,2

  1. School of Electronic Information and Physics, Central South University of Forestry & Technology, Changsha, Hunan 410004, China
  2. Research Center of Smart Forest Cloud, Central South University of Forestry & Technology, Changsha, Hunan 410004, China
  • Received: 2023-05-18 Revised: 2023-12-09 Accepted: 2025-02-25 Online: 2025-02-25 Published: 2025-02-24
  • Funding:
    National Natural Science Foundation of China (31870532); Changsha Science and Technology Program (kq2402265)

Abstract: In recent years, machine learning (ML) and deep learning (DL) have advanced rapidly, and a growing body of research applies them to intrusion detection systems (IDS). However, current intrusion detection datasets suffer from feature redundancy and from an imbalance in the number of samples across attack categories. To address these problems, a network anomaly detection method based on a stacked autoencoder (SAE) and a Wasserstein generative adversarial network (WGAN) is proposed. First, to address feature redundancy, the encoding, hidden-layer, decoding structure of the stacked autoencoder is used for dimensionality reduction, refining the features and extracting lower-dimensional features that are better suited to classification. Second, to address sample imbalance (categories with few samples and little diversity), the processed data are fed into the WGAN model as the source for the generator, and the generative capability of the network is used to augment the samples, compensating for the shortage of certain sample types during training of the classification model. Finally, a random forest (RF) classification model performs the detection. Experimental results on the NSL-KDD dataset show that the SAE-WGAN-RF model built on the proposed method achieves an F1-Score of 95.58%, a Recall of 96.54%, and a Precision of 96.03%, a significant improvement over common classical algorithms.

Key words: deep learning, generative adversarial networks, anomaly detection, stacked autoencoder
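
The abstract reports F1-Score, Recall, and Precision but does not state how they are averaged across attack classes. As a reading aid, the following minimal sketch shows how these three metrics are commonly computed from per-class confusion counts and then macro-averaged. The class names, the counts, and the helper names (`ClassCounts`, `macro_average`) are made up for illustration and are not taken from the paper; the averaging scheme is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ClassCounts:
    """Confusion counts for one class (hypothetical container)."""
    tp: int  # samples of this class predicted as this class
    fp: int  # samples of other classes predicted as this class
    fn: int  # samples of this class predicted as another class


def precision(c: ClassCounts) -> float:
    """Fraction of predictions for this class that are correct."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ClassCounts) -> float:
    """Fraction of this class's samples that are detected."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(c: ClassCounts) -> float:
    """Harmonic mean of per-class precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_average(counts: Dict[str, ClassCounts],
                  metric: Callable[[ClassCounts], float]) -> float:
    """Unweighted mean of a per-class metric over all classes."""
    return sum(metric(c) for c in counts.values()) / len(counts)


# Illustrative counts only; these are NOT results from the paper.
counts = {
    "normal": ClassCounts(tp=90, fp=30, fn=10),
    "attack": ClassCounts(tp=70, fp=10, fn=30),
}

macro_p = macro_average(counts, precision)
macro_r = macro_average(counts, recall)
macro_f1 = macro_average(counts, f1_score)

print(f"macro precision = {macro_p:.4f}")
print(f"macro recall    = {macro_r:.4f}")
print(f"macro F1        = {macro_f1:.4f}")
```

With macro averaging, the averaged F1 is the mean of the per-class F1 values, so it is not necessarily the harmonic mean of the averaged precision and the averaged recall. In this toy example it falls below both.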