• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (12): 2169-2176.

• Computer Networks and Information Security •

A binary vulnerability search method based on multi-granularity semantic analysis

LIU Hao (刘豪), MA Hui-fang (马慧芳), GONG Nan (龚楠), YAN Cai-rui (闫彩瑞)

  1. (School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2020-11-02 Revised: 2021-03-03 Accepted: 2021-12-25 Online: 2021-12-25 Published: 2021-12-31
  • Funding:
    National Natural Science Foundation of China (61762078, 61363058, 61966004); Research Project of the Guangxi Key Laboratory of Trusted Software (kx202003); Natural Science Foundation of Gansu Province (21JR7RA114)

Abstract: Binary file similarity detection aims to determine whether two binary files are highly similar, even when they come from different platforms, compilers, optimization configurations, or software versions. Binary vulnerability search is one of its applications in the field of information security. Binary vulnerabilities cause many problems for modern software applications, such as operating systems that are open to attack and private information that is easily stolen. The main cause of binary vulnerabilities is that code is reused during software development without strict supervision. To address this, a binary vulnerability search method based on multi-granularity semantic feature analysis, named Taurus, is proposed. The method uses semantic features at three granularities to search for potential cross-platform binary vulnerabilities. Given a binary file to be checked and a vulnerability database, the file is compared against each binary vulnerability in the database in turn. First, semantic extraction is performed on the two binary files to obtain their semantic features at the basic-block, function, and module granularities, and a similarity is computed at each granularity. Second, the similarities at the three granularities are integrated into an overall similarity score for the two files. Finally, the similarity scores between the binary file to be checked and all vulnerabilities in the database are sorted in descending order, which yields the search result report for that file. Comparative experiments under reasonable configurations show that Taurus is more accurate than the baseline method.

Key words: vulnerability search, multi-granularity semantic feature, cross-platform
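
The three-step search flow described in the abstract (per-granularity similarity, integration into one score, descending ranking) can be illustrated with a minimal Python sketch. The abstract does not state how Taurus represents semantic features or how it integrates the three similarities, so everything here is an assumption made for illustration: features are modeled as sets of tokens, the per-granularity similarity is Jaccard similarity, the integration is an equally weighted average, and the names `BinaryFeatures`, `overall_similarity`, `search`, `vuln-A`, and `vuln-B` are hypothetical.

```python
from dataclasses import dataclass

# The three granularities named in the abstract.
GRANULARITIES = ("basic_block", "function", "module")


@dataclass(frozen=True)
class BinaryFeatures:
    """Semantic features of one binary: one token set per granularity (assumed representation)."""
    name: str
    features: dict  # granularity -> frozenset of feature tokens


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def overall_similarity(target, vuln, weights=None):
    """Weighted average of the per-granularity similarities (assumed integration rule)."""
    weights = weights or {g: 1.0 for g in GRANULARITIES}
    total = sum(weights[g] for g in GRANULARITIES)
    score = sum(
        weights[g] * jaccard(target.features[g], vuln.features[g])
        for g in GRANULARITIES
    )
    return score / total


def search(target, database):
    """Score the target against every database entry and sort in descending order."""
    report = [(v.name, overall_similarity(target, v)) for v in database]
    report.sort(key=lambda item: item[1], reverse=True)
    return report


# Hypothetical toy data: the tokens stand in for extracted semantic features.
target = BinaryFeatures("target.bin", {
    "basic_block": frozenset({"a", "b", "c"}),
    "function": frozenset({"f", "g"}),
    "module": frozenset({"m"}),
})
database = [
    BinaryFeatures("vuln-B", {
        "basic_block": frozenset({"x"}),
        "function": frozenset({"g", "h"}),
        "module": frozenset({"m"}),
    }),
    BinaryFeatures("vuln-A", {
        "basic_block": frozenset({"a", "b", "c", "d"}),
        "function": frozenset({"f", "g"}),
        "module": frozenset({"m"}),
    }),
]

report = search(target, database)
for name, score in report:
    print(f"{name}: {score:.3f}")
```

In this toy setup the entry that shares more features with the target at every granularity is ranked first. Passing a different `weights` dictionary to `overall_similarity` would let one granularity count for more than the others, which is one plausible way to tune such an integration step.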