• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (06): 994-1002.

• Computer Networks and Information Security •

  • Supported by:
    National Natural Science Foundation of China (31870532)

Webshell detection based on deep learning

CHE Sheng-bing, ZHANG Guang-lin

  1. (School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China)
  • Received: 2020-07-14 Revised: 2021-03-02 Accepted: 2022-06-25 Online: 2022-06-25 Published: 2022-06-17

Abstract: Taking Webshell detection in AWD attack-defense competitions as the setting, fuzzy C-means clustering is used to analyze Webshell attack vectors in hyperspace, which shows that the vectors are globally sparse but locally tightly clustered. Two deep learning models for Webshell detection are proposed on this basis. Because most of the Webshell samples collected from GitHub were gathered at random and are not well targeted, the length of the training data is capped and only a limited number of relevant samples is retained. A single attack is closely related to the 2 to 4 adjacent operations, and the attack vectors show strong correlation in the vertical direction while remaining relatively stable in the horizontal direction. Since the feature vectors would otherwise shrink as they pass through the network, zero padding is added to the convolutional layers. To address the sawtooth oscillation in the training curves, a fast calculation formula for the Adam optimization algorithm is proved and the learning parameters are corrected accordingly. This progressively removes the sawtooth from the training loss curve, so that the curve falls smoothly according to an exponential law and the desired training result is reached quickly. The two proposed models are compared experimentally with existing similar detection models. The results show that the proposed models can effectively detect Webshell attacks in AWD.
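
The effect of the zero-padding option mentioned in the abstract can be illustrated with a small sketch. It is not the authors' model: the 8 x 16 input, the 3 x 3 averaging kernel, and the helper `conv2d` are assumptions chosen only to show that an unpadded stride-1 convolution shrinks the feature map, while padding by (k - 1) / 2 on each side preserves its size.

```python
import numpy as np

def conv2d(x, kernel, pad):
    """Plain 2-D cross-correlation with stride 1 and symmetric zero padding."""
    if pad > 0:
        # Surround the input with `pad` rows/columns of zeros on every side.
        x = np.pad(x, pad, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical sizes: 8 operations (rows) by 16 features (columns).
x = np.random.default_rng(0).normal(size=(8, 16))
kernel = np.ones((3, 3)) / 9.0  # illustrative 3x3 averaging kernel

valid = conv2d(x, kernel, pad=0)  # no padding: loses 2 rows and 2 columns
same = conv2d(x, kernel, pad=1)   # pad = (3 - 1) // 2 keeps the input size

print(valid.shape)  # (6, 14)
print(same.shape)   # (8, 16)
```

Without padding, every such layer trims the map further, so a stack of layers would quickly erode the short vertical neighborhood of adjacent operations that the abstract identifies as informative.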

Key words: deep learning, Web security, Webshell
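
For reference, the textbook Adam update that the abstract builds on can be sketched as follows. The fast calculation formula and the corrected learning parameters are not given in the abstract, so this is only the standard algorithm applied to a toy quadratic loss; the function name `adam_minimize`, the hyperparameter values, and the loss itself are illustrative assumptions.

```python
import numpy as np

def adam_minimize(grad_fn, theta0, lr=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Standard Adam with bias-corrected first and second moment estimates."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)  # running mean of gradients
    v = np.zeros_like(theta)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for early steps
        v_hat = v / (1.0 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

def loss(theta):
    # Toy objective (assumption): f(theta) = sum(theta^2), minimum at 0.
    return float(np.sum(theta ** 2))

def grad(theta):
    return 2.0 * theta

theta0 = np.array([1.0, -2.0])
theta_final = adam_minimize(grad, theta0)
print(loss(theta0), loss(theta_final))
```

The learning rate and the two decay rates `beta1` and `beta2` are the parameters whose choice governs how smooth the loss curve is, which is presumably where the corrections described in the abstract apply.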