• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (12): 2372-2378.

• Articles •

Research on a feature selection method based on sparse graph representation

WANG Xiaodong, YAN Fei, XIE Yong, JIANG Huiqin

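  1. (College of Computer and Information Engineering, Xiamen University of Technology, Xiamen, Fujian 361024)
  • Received: 2015-08-12 Revised: 2015-10-11 Online: 2015-12-25 Published: 2015-12-25
  • Funding:

    National Natural Science Foundation of China (61502405); Education and Scientific Research Project for Young and Middle-aged Teachers of the Fujian Provincial Department of Education (JA15385, JA15368); Special Project for External Science and Technology Cooperation of Xiamen University of Technology (E201400400)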

A feature selection method based on sparse graph representation

WANG Xiaodong,YAN Fei,XIE Yong,JIANG Huiqin   

  1. (College of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China)
  • Received: 2015-08-12 Revised: 2015-10-11 Online: 2015-12-25 Published: 2015-12-25

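Summary:

Feature selection aims to reduce the dimensionality of the data to be processed and to remove redundant features, and it is one of the key problems in machine learning. Existing semi-supervised feature selection methods generally rely on a graph model to extract the cluster structure of a data set, but the extracted cluster structure lacks clear boundaries, which degrades feature selection. To address this, a semi-supervised feature selection method based on sparse graph representation is proposed. It builds a joint learning model of cluster structure and feature selection, constrains the graph model with the l1-norm to obtain a clear cluster structure, and introduces the l2,1-norm to avoid interference from noise and improve the accuracy of feature selection. To verify the method, several currently popular feature selection methods are chosen for comparative analysis, and the experimental results show that the method is effective.

Keywords: feature selection, semi-supervised learning, l2,1-norm, l1-norm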

Abstract:

Feature selection, which aims to reduce the dimensionality of data by removing redundant features, is one of the main issues in machine learning. Most existing graph-based semi-supervised feature selection algorithms fail to capture a clear cluster structure. In this paper, we propose a semi-supervised algorithm based on an l1-norm graph. A joint learning framework is built for cluster structure and feature selection; an l1-norm constraint is imposed to guarantee the sparsity of the cluster structure, which suits feature selection. To select the most relevant features and reduce the effect of outliers, l2,1-norm regularization is added to the objective function. We evaluate the proposed algorithm on several data sets and compare the results with state-of-the-art semi-supervised feature selection algorithms. The results demonstrate the effectiveness of the proposed algorithm.

Key words: feature selection; semi-supervised learning; l2,1-norm; l1-norm
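
The abstract names two norms but does not state the objective function, so the following is only a minimal sketch of their standard definitions, not the authors' algorithm. The matrix `W` and the helpers `l1_norm`, `l21_norm`, and `rank_features` are hypothetical names introduced for illustration. Ranking features by the row norms of a learned projection matrix is a common convention in l2,1-regularized methods and is assumed here rather than taken from the paper.

```python
import math

def l1_norm(S):
    """Entry-wise l1-norm: sum of the absolute values of all entries."""
    return sum(abs(v) for row in S for v in row)

def l21_norm(W):
    """l2,1-norm: sum over rows of each row's Euclidean (l2) norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)

def rank_features(W):
    """Feature indices ordered by descending row norm (assumed convention)."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: -norms[i])

# Hypothetical projection matrix: 4 features (rows) x 2 classes (columns).
# An all-zero row means that feature contributes nothing and can be dropped.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0],
     [0.0, 2.0]]

print(l1_norm(W))        # 10.0
print(l21_norm(W))       # 8.0
print(rank_features(W))  # [0, 3, 2, 1]
```

Because the l2,1-norm sums unsquared row norms, penalizing it tends to push entire rows of `W` to zero, which is why it is often used to select features, while the l1-norm encourages sparsity entry by entry.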