• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (07): 1191-1196.

• Software Engineering

Design and implementation of a high-level code clone detection method

ZOU Yue, WU Ming, XU Yun

  1. School of Computer Science, University of Science and Technology of China, Hefei 230027, China
  2. Key Laboratory of High Performance Computing of Anhui Province, Hefei 230026, China

  • Received: 2019-12-23  Revised: 2020-02-24  Accepted: 2020-07-25  Online: 2020-07-25  Published: 2020-07-24
  • Funding:
    National Natural Science Foundation of China (61672480)

Abstract: Code clone detection is a fundamental research topic in software engineering, with wide application in software analysis and maintenance. For high-level clones that differ textually, namely the Type-3 and Type-4 clones defined in the academic literature, existing methods suffer from a low detection rate (recall). Detection based on the program dependency graph (PDG) is an important class of high-level clone detection methods, but these methods rely on exact graph matching algorithms such as subgraph isomorphism, which have high time complexity and low recall. We therefore propose a new high-level code clone detection method that uses an approximate (inexact) graph matching algorithm based on the Weisfeiler-Lehman graph kernel. Experimental results show that, compared with existing code clone detection methods, our method detects more high-level clones and takes less computation time.

Key words: code clone detection, program dependency graph, Weisfeiler-Lehman graph kernel
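
To illustrate the idea of approximate graph matching with a Weisfeiler-Lehman graph kernel, the following is a minimal sketch and not the authors' implementation. It assumes that each PDG has already been reduced to an undirected adjacency map plus one statement-kind label per node. The function names, the toy graphs, and the label vocabulary (`decl`, `assign`, and so on) are all hypothetical.

```python
from collections import Counter
from math import sqrt


def wl_features(adj, labels, iterations=2):
    """Count the node labels seen during `iterations` rounds of WL relabeling."""
    current = dict(labels)
    # Round 0: the original node labels.
    features = Counter((0, lab) for lab in current.values())
    for it in range(1, iterations + 1):
        # New label = own label plus the sorted multiset of neighbor labels.
        current = {
            node: (current[node], tuple(sorted(current[n] for n in adj[node])))
            for node in adj
        }
        features.update((it, lab) for lab in current.values())
    return features


def wl_kernel(fa, fb):
    """Dot product of two label-count vectors."""
    return sum(count * fb[key] for key, count in fa.items())


def wl_similarity(ga, gb, iterations=2):
    """Kernel value normalized to [0, 1]; both graphs must be non-empty."""
    fa = wl_features(*ga, iterations)
    fb = wl_features(*gb, iterations)
    return wl_kernel(fa, fb) / sqrt(wl_kernel(fa, fa) * wl_kernel(fb, fb))


# Toy stand-ins for PDGs: (adjacency map, node labels). All hypothetical.
original = (
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},
    {0: "decl", 1: "assign", 2: "cond", 3: "return"},
)
# Same fragment with one extra statement attached (a Type-3-style edit).
near_clone = (
    {0: [1], 1: [0, 2], 2: [1, 3, 4], 3: [2], 4: [2]},
    {0: "decl", 1: "assign", 2: "cond", 3: "return", 4: "call"},
)
unrelated = (
    {0: [1], 1: [0, 2], 2: [1]},
    {0: "decl", 1: "loop", 2: "call"},
)

print(wl_similarity(original, near_clone))
print(wl_similarity(original, unrelated))
```

Unlike subgraph isomorphism, this comparison never searches for a node-to-node mapping. It only compares counts of neighborhood labels, so a fragment with an inserted statement still scores well above an unrelated fragment. A detector built this way would report a clone when the score exceeds a chosen threshold, and that threshold is a tuning parameter not given in the abstract.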