
Computer Engineering & Science, 2023, Vol. 45, Issue (04): 638-645.

• Software Engineering •

Automatic code comment generation with an attention-based Tree2Seq model

ZHAO Le-le, ZHANG Li-ping, ZHAO Feng-rong

  (College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China)
  • Received: 2021-05-12  Revised: 2021-09-27  Accepted: 2023-04-25  Online: 2023-04-25  Published: 2023-04-13

Abstract: Code comments help developers quickly understand code and reduce maintenance costs. The classical Seq2Seq model compresses a program's structural information into a flat token sequence, so that structure is lost. To preserve it, a Tree-LSTM encoder is proposed that encodes the code directly as an abstract syntax tree, allowing the comment generation model to capture the code's structural information and improve the quality of the generated comments. An attention-based Tree2Seq model performs the comment generation task, avoiding the bottleneck in which the encoder compresses all input information into a single fixed-length vector and loses part of it. Experiments are carried out on two programming-language datasets, Java and Python. Three automatic evaluation metrics commonly used in machine translation are used for evaluation and verification, and part of the test data is selected for manual evaluation. Experimental results show that the attention-based Tree2Seq model provides more comprehensive and richer semantic and structural information to the decoder, and offers guidance for subsequent experimental analysis and improvement.
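The paper itself does not include source code; the sketch below only illustrates the two components the abstract names, under stated assumptions: a child-sum Tree-LSTM cell (the standard Tai et al., 2015 formulation, a common choice for encoding variable-arity AST nodes) and simple dot-product attention over the encoded node states. The names ChildSumTreeLSTMCell and attention_context are hypothetical, and the paper's exact gating and attention-scoring variants may differ.

    import torch
    import torch.nn as nn

    class ChildSumTreeLSTMCell(nn.Module):
        # Child-sum Tree-LSTM cell: aggregates an arbitrary number of
        # child states, so one cell handles any AST node arity.
        def __init__(self, input_size: int, hidden_size: int):
            super().__init__()
            self.W_iou = nn.Linear(input_size, 3 * hidden_size)
            self.U_iou = nn.Linear(hidden_size, 3 * hidden_size, bias=False)
            self.W_f = nn.Linear(input_size, hidden_size)
            self.U_f = nn.Linear(hidden_size, hidden_size, bias=False)

        def forward(self, x, child_h, child_c):
            # x: (input_size,) embedding of the current AST node
            # child_h, child_c: (num_children, hidden_size); empty for leaves
            h_sum = child_h.sum(dim=0)
            i, o, u = (self.W_iou(x) + self.U_iou(h_sum)).chunk(3, dim=-1)
            i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
            # one forget gate per child lets the tree structure modulate
            # how much of each subtree's memory flows upward
            f = torch.sigmoid(self.W_f(x) + self.U_f(child_h))
            c = i * u + (f * child_c).sum(dim=0)
            h = o * torch.tanh(c)
            return h, c

    def attention_context(decoder_h, node_states):
        # Dot-product attention over all encoded AST node states, so the
        # decoder reads from every node rather than from one fixed-length
        # summary vector.
        weights = torch.softmax(node_states @ decoder_h, dim=0)
        return weights @ node_states

In this kind of design, encoding proceeds bottom-up over the AST (children before parents), and at each decoding step the context vector from attention_context is combined with the decoder state to predict the next comment token.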

Key words: code comment, automatic generation, attention mechanism, Tree2Seq