• Journal of the China Computer Federation
• Chinese Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science

• Computer Network and Information Security

A DGA domain name detection method based on Transformer

ZHANG Xin, CHENG Hua, FANG Yi-quan

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2019-04-30 Revised: 2019-10-12 Online: 2020-03-25 Published: 2020-03-25
  • Funding:

    CERNET Next Generation Internet Technology Innovation Project (NGII20170520)

Abstract:

Existing domain generation algorithm (DGA) detection methods achieve high detection accuracy but suffer from a high false alarm rate on abbreviated domain names. The main reason is that the characters in abbreviated domain names appear highly random, so methods that rely on randomness struggle to distinguish abbreviated domain names from DGA domain names. After analyzing the character characteristics of abbreviated domain names, this paper uses a self-attention mechanism to detect dependencies between domain name characters, and uses an LSTM to improve the encoding scheme of the Transformer model so that it better captures the position of characters in domain names. On this basis, a Transformer-based DGA domain name detection method (MHA) is constructed. Experimental results show that MHA can effectively distinguish DGA domain names from abbreviated domain names, achieving higher precision and a lower false alarm rate.
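The self-attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' MHA model: it uses a single head, no learned projections (Q = K = V), no LSTM encoding, and a hypothetical `embed_chars` helper that assigns each character a fixed pseudo-random vector in place of a learned embedding. The domain label `"ecust"` is only an illustrative abbreviated name.

```python
import math
import random


def embed_chars(domain, dim=8, seed=0):
    """Hypothetical stand-in for a learned character embedding.

    Each character maps to a fixed pseudo-random vector, so the same
    character always receives the same vector.
    """
    vectors = []
    for ch in domain:
        rng = random.Random(seed * 1000 + ord(ch))
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Returns (weights, outputs): weights[i][j] is how strongly character i
    attends to character j, and outputs[i] is the weighted mix of all
    character vectors for position i.
    """
    d = len(x[0])
    weights, outputs = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * x[j][i] for j in range(len(x)))
                        for i in range(d)])
    return weights, outputs


# Illustrative input: one abbreviated-looking label, one character per position.
weights, outputs = self_attention(embed_chars("ecust"))
for ch, row in zip("ecust", weights):
    print(ch, [round(w, 3) for w in row])
```

Each printed row is a probability distribution over the characters of the label, which is the sense in which attention scores can expose character dependencies. Because this sketch carries no position information, reordering the characters merely reorders the rows, which is the kind of gap a position-aware encoding such as the LSTM described in the abstract is meant to fill.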

Key words: abbreviated domain name, Transformer model, self-attention mechanism, character dependency