• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (01): 93-98.

• Articles •

基于粗糙集支持向量机的软件缺陷预测

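MENG Qian1,2, MA Xiaoping1

  1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221008
  2. College of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116
  • Received: 2014-08-12 Revised: 2014-10-19 Published: 2015-01-25 Released online: 2015-01-25
  • Supported by:

    National Natural Science Foundation of China (61303183); Natural Science Foundation of Jiangsu Province (BK20130204); Specialized Research Fund for the Doctoral Program of Higher Education (20120095120023)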

Software defect prediction using rough sets and support vector machine

MENG Qian1,2, MA Xiaoping1

  1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221008, China
  2. College of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China
  • Received: 2014-08-12 Revised: 2014-10-19 Online: 2015-01-25 Published: 2015-01-25

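Abstract (translated from the Chinese original):

Software defect prediction has become an important research topic in software engineering. This paper constructs a software defect prediction model based on rough sets and support vector machines. The model uses rough sets to perform attribute reduction on the original sample set, removing attributes that are redundant or irrelevant to defect prediction, and uses particle swarm optimization to select the parameters of the support vector machine. The experimental data come from the public NASA datasets; attribute reduction cuts the number of feature attributes from 21 to 5. Experiments show that, after attribute reduction, the prediction performance of the Bayes classifier, the CART tree, the neural network, and the proposed rough set-support vector machine model all improve, and that the proposed model outperforms the other three.

Key words (translated from the Chinese original): rough sets, support vector machine, software defect, prediction, particle swarm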

Abstract:

Software defect prediction has become an important research topic in software engineering. This paper constructs a classification model for predicting software defects that integrates rough sets with a support vector machine (RS-SVM). Rough sets act as a preprocessor that removes redundant information and reduces data dimensionality before the sample data are passed to the support vector machine. Because choosing the parameters of a support vector machine is difficult, the particle swarm optimization algorithm is used to select them. The experimental data come from the publicly available NASA datasets, and rough sets reduce the dimension of the original data sets from 21 to 5. Experimental results indicate that the prediction performance of the Bayes classifier, the CART tree, the RBF neural network, and RS-SVM all improve after this reduction, and that RS-SVM achieves higher prediction performance than the other three models.
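The abstract does not say which reduction algorithm is used, so the following is only a minimal sketch of rough-set attribute reduction, assuming the greedy QuickReduct heuristic applied to attributes that have already been discretized. The toy decision table, the attribute meanings in the comments, and the function names are hypothetical illustrations and are not taken from the paper or from the NASA data.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Share of rows in the positive region: rows whose equivalence class
    (with respect to attrs) contains a single decision label."""
    pure = sum(len(block) for block in partition(rows, attrs)
               if len({labels[i] for i in block}) == 1)
    return pure / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset is as discriminating as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates,
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return sorted(reduct)


# Hypothetical discretized metrics per module:
# (size bin, complexity bin, comment bin, operator bin)
ROWS = [
    (0, 0, 1, 0),
    (0, 1, 1, 0),
    (1, 0, 0, 0),
    (1, 1, 0, 0),
    (0, 0, 0, 0),
    (1, 0, 1, 0),
]
LABELS = [0, 1, 1, 1, 0, 1]  # 1 = defective, 0 = not defective (made up)

reduct = quick_reduct(ROWS, LABELS)
print(reduct)  # prints [0, 1]
```

In this toy table the last two attributes carry no information that the first two do not already provide, so the greedy search drops them. QuickReduct returns a reduct candidate rather than a guaranteed minimal reduct.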

Key words: rough sets; support vector machine; software defect; prediction; particle swarm optimization
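The parameter-selection step named in the abstract can be sketched as a basic global-best particle swarm search. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the kernel, the tuned parameters, or the swarm settings, so the sketch assumes two parameters searched on a log scale (for example the penalty C and the RBF width gamma), and it replaces the real objective, which would be a cross-validated SVM error, with a hypothetical smooth stand-in function so that the block runs without any machine learning library.

```python
import random


def cv_error_stand_in(log_c, log_gamma):
    """Hypothetical stand-in for a cross-validated SVM error.
    A real objective would train an SVM with these parameters and
    return its validation error; this bowl has its minimum at (1, -2)."""
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def pso(objective, bounds, n_particles=20, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best position, best objective value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # personal bests
    best_val = [objective(*p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # move and clamp to the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(*pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


best, err = pso(cv_error_stand_in, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, err)
```

To tune an actual classifier, `cv_error_stand_in` would be swapped for a function that fits the SVM on the reduced attribute set and returns its validation error; the inertia weight and acceleration coefficients used here are common textbook defaults, not values reported in the paper.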