• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (10): 1815-1824.

• Software Engineering •

Code plagiarism detection based on graph neural network

CHEN Chang-feng¹, ZHAO Hong-zhou¹, ZHOU Kai-qing²

  1. College of Computer Science and Engineering, Jishou University, Jishou 416000, China
  2. School of Communication and Electronic Engineering, Jishou University, Jishou 416000, China
  • Received: 2023-05-03 Revised: 2023-12-25 Accepted: 2024-09-25 Online: 2024-10-25 Published: 2024-10-29

Abstract: As open-source data becomes increasingly accessible, the cost of code plagiarism has fallen, which seriously harms the healthy development of the software industry. Existing plagiarism detection methods struggle to mine the deep semantic and structural information of source code, so they perform poorly on semantic plagiarism. To address this limitation, this paper proposes a code plagiarism detection method based on graph neural networks. The method uses a graph neural network to represent the semantic and structural features of source code and a graph attention network to enhance those features. A neural tensor network then produces a similarity vector for each pair of source codes, and a fully connected network maps that vector to a final similarity score. Dropout is incorporated to balance neuron weights, optimize the model, and prevent overfitting. To validate the method, experiments were conducted on a dataset from an online judge (OJ) system, and the results were compared with those of currently popular detection methods. The experimental results show that the proposed method achieves better performance than the compared methods.
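The pipeline described in the abstract (graph encoding, attention-based enhancement, neural tensor network, fully connected scoring, dropout) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`SimilarityModel`, `gnn_layer`, `attention_pool`, `ntn`) is hypothetical, the weights are random and untrained, a single mean-aggregation layer stands in for the graph neural network, and a simple attention-weighted pooling stands in for the graph attention network. Each program is assumed to be already converted into an adjacency matrix plus per-node feature vectors.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    """Random (untrained) weights, for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def gnn_layer(adj, feats, weight):
    """One message-passing step: average self and neighbours, project, ReLU."""
    n, d = len(adj), len(feats[0])
    agg = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        agg.append([sum(feats[j][k] for j in nbrs) / len(nbrs) for k in range(d)])
    return [[max(0.0, v) for v in row] for row in matmul(agg, weight)]

def attention_pool(nodes, weight):
    """Simplified attention: score each node against a global context vector."""
    d = len(nodes[0])
    mean = [[sum(row[k] for row in nodes) / len(nodes) for k in range(d)]]
    context = [math.tanh(v) for v in matmul(mean, weight)[0]]
    scores = [1 / (1 + math.exp(-sum(a * c for a, c in zip(row, context))))
              for row in nodes]
    return [sum(s * row[k] for s, row in zip(scores, nodes)) for k in range(d)]

def ntn(h1, h2, tensor, v, bias):
    """Neural tensor network: one bilinear slice plus linear term per output."""
    out = []
    for w_k, v_k, b_k in zip(tensor, v, bias):
        bilinear = sum(h1[i] * w_k[i][j] * h2[j]
                       for i in range(len(h1)) for j in range(len(h2)))
        linear = sum(x * y for x, y in zip(v_k, h1 + h2))
        out.append(max(0.0, bilinear + linear + b_k))
    return out

def dropout(vec, rate, rng, training):
    """Inverted dropout: active only during training, identity otherwise."""
    if not training or rate == 0.0:
        return list(vec)
    return [0.0 if rng.random() < rate else x / (1 - rate) for x in vec]

class SimilarityModel:
    """Hypothetical container wiring the four stages together."""

    def __init__(self, in_dim, hidden, slices, seed=0):
        self.rng = random.Random(seed)
        self.w_gnn = rand_matrix(in_dim, hidden, self.rng)
        self.w_att = rand_matrix(hidden, hidden, self.rng)
        self.tensor = [rand_matrix(hidden, hidden, self.rng) for _ in range(slices)]
        self.v = rand_matrix(slices, 2 * hidden, self.rng)
        self.bias = [0.0] * slices
        self.w_out = [self.rng.uniform(-0.5, 0.5) for _ in range(slices)]

    def embed(self, adj, feats):
        """Graph -> fixed-size embedding."""
        return attention_pool(gnn_layer(adj, feats, self.w_gnn), self.w_att)

    def similarity(self, graph_a, graph_b, training=False, rate=0.5):
        """Pair of graphs -> similarity score in (0, 1)."""
        h1, h2 = self.embed(*graph_a), self.embed(*graph_b)
        s = dropout(ntn(h1, h2, self.tensor, self.v, self.bias),
                    rate, self.rng, training)
        logit = sum(w * x for w, x in zip(self.w_out, s))
        return 1 / (1 + math.exp(-logit))
```

A call such as `SimilarityModel(2, 4, 3).similarity((adj, feats), (adj, feats))` returns a value between 0 and 1. With untrained weights the score carries no meaning; in practice the parameters would be fitted on labelled plagiarism pairs.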

Key words: code plagiarism detection; deep semantic and structural information extraction; graph neural network; graph attention network; feature enhancement