• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (05): 932-937.

• Articles •

A software defect prediction method based on topic model

ZHANG Zetao1,2, YE Lijun1,2, CHENG Wei1,2, GU Jun1,2

  1. (1. Shanghai Key Laboratory of Aerospace Intelligent Control Technology, Shanghai 201109; 2. Shanghai Institute of Spaceflight Control Technology, Shanghai 201109, China)
  • Received: 2015-02-02 Revised: 2015-12-03 Online: 2016-05-25 Published: 2016-05-25

Abstract:

Software defect prediction typically trains a model on surface features of the code, such as the text of source files and comments, and applies it to new samples. This ignores the topics hidden behind the code, such as technical aspects and business logic, and leads to inaccurate predictions. To address this problem, we present a topic-based defect prediction method. The software code base is treated as a collection of topics and technical aspects, each with its own defect tendency. The LDA (latent Dirichlet allocation) topic model is used to model the topics and their defect tendencies, a set of topic-based metrics is computed from the modeling results, and the prediction model is trained on the topic metrics together with traditional metrics. Experimental results show that the proposed method is more accurate than traditional defect prediction methods and keeps the model relatively stable as the software evolves, which suggests that it is applicable to a range of defect prediction tasks.

Key words: topic model; software defect prediction; software engineering
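
The abstract does not define the topic metrics themselves, so the following is only a minimal sketch of how a topic-based metric could be combined with traditional metrics. It assumes that an LDA model has already produced a per-file topic membership matrix `theta`, and that defect counts are known for the training files. The function names (`topic_defect_density`, `topic_metric`, `combine_features`) and the particular metric (a membership-weighted defect density) are illustrative assumptions, not the paper's definitions.

```python
from typing import List, Sequence


def topic_defect_density(theta: Sequence[Sequence[float]],
                         defects: Sequence[float]) -> List[float]:
    """Assumed metric: membership-weighted average defect count per topic.

    theta[f][z] is the membership of file f in topic z (e.g. a row of the
    document-topic matrix from an LDA model); defects[f] is the known
    defect count of file f.
    """
    n_topics = len(theta[0])
    density = []
    for z in range(n_topics):
        weight = sum(row[z] for row in theta)
        weighted_defects = sum(row[z] * d for row, d in zip(theta, defects))
        # A topic that no file belongs to gets a density of zero.
        density.append(weighted_defects / weight if weight > 0 else 0.0)
    return density


def topic_metric(theta_row: Sequence[float],
                 density: Sequence[float]) -> float:
    """Assumed per-file topic metric: expected topic defect density."""
    return sum(p * d for p, d in zip(theta_row, density))


def combine_features(traditional: Sequence[Sequence[float]],
                     theta: Sequence[Sequence[float]],
                     density: Sequence[float]) -> List[List[float]]:
    """Append the topic metric to each file's traditional metrics."""
    return [list(trad) + [topic_metric(row, density)]
            for trad, row in zip(traditional, theta)]


if __name__ == "__main__":
    # Hypothetical data: three files, two topics.
    theta = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    defects = [4, 0, 2]
    traditional = [[100, 3], [20, 1], [50, 2]]  # e.g. lines of code, churn
    density = topic_defect_density(theta, defects)
    for row in combine_features(traditional, theta, density):
        print(row)
```

The combined rows could then be passed to any standard classifier or regressor. For files whose defect counts are not yet known, only `theta` is needed, since the topic densities come from the training data.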