Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (07): 1324-1330.
CHEN Jie,FENG Xiu-fang,CHEN Yong-le
Abstract: To identify the true authors of source code in a corpus, this paper proposes an authorship attribution method that combines code coupling degree with program dependency graph (PDG) features. First, parameter, fan-in, and fan-out features are extracted from the source code and used to calculate its coupling degree. Second, control and data dependencies are extracted from the program dependency graph built from the code, a preprocessing step converts the PDG features into small instances annotated with their frequencies, and term frequency-inverse document frequency (TF-IDF) weighting is used to amplify the importance of each PDG feature in the source code. Finally, a CPNN model is used to predict the coding style characteristics of programmers and to attribute each piece of code to its real author. Experiments on a source code dataset from 1,000 programmers show an authorship attribution accuracy of 95%.
Key words: coupling degree, program dependency graph, authorship attribution
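
The TF-IDF weighting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature tokens (such as "ctrl:if->call"), the function name tf_idf, and the plain tf * log(N / df) formula are assumptions chosen for illustration, and the paper may use a different variant or smoothing.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for each document.

    documents: list of token lists, one list per source file
               (hypothetical PDG feature tokens in this sketch).
    Returns a list of dicts mapping feature -> weight.
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each feature occurs.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        doc_weights = {}
        for feature, count in counts.items():
            tf = count / total                        # relative frequency in this file
            idf = math.log(n_docs / doc_freq[feature])  # rarity across the corpus
            doc_weights[feature] = tf * idf
        weights.append(doc_weights)
    return weights


# Hypothetical PDG features for three source files (illustrative only).
corpus = [
    ["ctrl:if->call", "data:x->y", "data:x->y"],
    ["ctrl:if->call", "ctrl:loop->assign"],
    ["ctrl:if->call", "data:x->y", "ctrl:loop->return"],
]
weights = tf_idf(corpus)
```

In this sketch, a feature that appears in every file (here "ctrl:if->call") receives a weight of zero, while features that are frequent in one file but rare across the corpus are weighted up, which is the amplification effect the abstract describes.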
CHEN Jie, FENG Xiu-fang, CHEN Yong-le. A source code authorship attribution prediction method based on code coupling degree and PDG features [J]. Computer Engineering & Science, 2021, 43(07): 1324-1330.
URL: http://joces.nudt.edu.cn/EN/Y2021/V43/I07/1324