• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (10): 1775-1792.

• Computer Network and Information Security •

A survey of source code vulnerability detection research based on graph neural networks

CHEN Zi-xiong1, CHEN Xu1, JING Yong-jun1, SONG Ji-fei2

  (1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021;
    2. National (Zhongwei) New-type Internet Exchange Point, Zhongwei 755000, China)
  • Received: 2024-01-03 Revised: 2024-03-06 Accepted: 2024-10-25 Online: 2024-10-25 Published: 2024-10-29
  • Funding:
    Key Research and Development Program of Ningxia Hui Autonomous Region (2023BDE02017); Fundamental Research Funds for the Central Universities, North Minzu University (2022PT_S04)

Abstract: With the widespread use of open-source software across many domains, source code vulnerabilities have led to a series of serious security problems. Given the threat these vulnerabilities pose to computer systems, detecting them in software to prevent cyberattacks has become an important research area. To automate detection and reduce manual effort, researchers have proposed many methods based on traditional deep learning. However, most of these methods treat source code as a natural language sequence and do not adequately consider its structural information, which limits their detection effectiveness. In recent years, vulnerability detection methods based on code graph representations and graph neural networks have emerged. This paper comprehensively reviews the application of graph neural networks to source code vulnerability detection and proposes a general framework for GNN-based source code vulnerability detection. Existing methods and related datasets are systematically summarized at three detection granularities: file level, function level, and slice level. Finally, the challenges facing the field are discussed, and possible directions for future research are outlined.

Keywords: graph neural networks, vulnerability detection, datasets, data flow graph, control flow graph