• A journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Artificial Intelligence and Data Mining

  • Funding:

    National Key R&D Program of China (2016YFB1000905); National Natural Science Foundation of China (6117013120); National 973 Program (2013CB329404); China Postdoctoral Science Foundation (2015M570837); Natural Science Foundation of Guangxi (2015GXNSFCB139011)

A feature selection algorithm based on kernel sparse representation

Lv Zhi-zheng,LI Yang-ding,LEI Cong   

  1. (College of Computer Science and Information Engineering, Guangxi Normal University, Guilin 541004, China)
  • Received: 2018-10-29 Revised: 2019-07-12 Online: 2020-01-25 Published: 2020-01-25


Abstract:

To address the “curse of dimensionality” that arises when classifying high-dimensional data, this paper proposes a new feature selection algorithm that combines kernel functions with sparse learning. Specifically, a kernel function first maps each feature dimension into a kernel space, where linear feature selection is performed; this realizes nonlinear feature selection in the original low-dimensional space. Secondly, sparse reconstruction is performed on the features mapped into the kernel space, yielding a sparse representation of the original dataset. Next, the L1-norm is used to build a feature scoring and selection mechanism that picks out the optimal feature subset. Finally, the data after feature selection are used in classification experiments. Experimental results on public datasets show that the proposed algorithm performs feature selection effectively and improves classification accuracy by about 3% over the competing algorithms.

Key words: feature selection, nonlinear, kernel function, sparse learning, L1-norm
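The pipeline described in the abstract (per-feature kernel mapping, L1-penalized sparse reconstruction, coefficient-based feature scoring) could be sketched roughly as follows. This is an illustrative sketch only, not the paper's actual formulation: the landmark-based RBF approximation of the kernel map, the ISTA solver for the L1 problem, and all parameter values (`n_landmarks`, `lam`, `gamma`) are assumptions made for the example.

```python
import numpy as np

def rbf_feature_map(x, centers, gamma=1.0):
    # Map a single feature column into kernel space via RBF similarities
    # to a few landmark points (a finite-dimensional approximation).
    return np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)

def kernel_sparse_feature_select(X, y, n_landmarks=5, lam=0.1, n_iter=500):
    """Score each original feature by the sparse weight mass its kernel
    block receives in an L1-penalized reconstruction of y."""
    n, d = X.shape
    blocks = []
    for j in range(d):
        centers = np.linspace(X[:, j].min(), X[:, j].max(), n_landmarks)
        blocks.append(rbf_feature_map(X[:, j], centers))
    Phi = np.hstack(blocks)  # n x (d * n_landmarks), one block per feature

    # L1-penalized least squares (Lasso) solved by ISTA:
    # minimize 0.5 * ||Phi w - y||^2 + lam * ||w||_1
    w = np.zeros(Phi.shape[1])
    step = 1.0 / (np.linalg.norm(Phi, 2) ** 2)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ w - y)
        w = w - step * grad
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft threshold

    # Score feature j by the L1 norm of its coefficient block.
    return np.array([np.abs(w[j * n_landmarks:(j + 1) * n_landmarks]).sum()
                     for j in range(d)])

# Toy usage: feature 0 drives y nonlinearly, feature 1 is pure noise,
# so feature 0 should receive a much higher score.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0])
scores = kernel_sparse_feature_select(X, y)
```

Because the relevance of feature 0 is nonlinear (a sine), a purely linear selector could miss it; mapping each feature through a kernel first is what lets the linear L1 machinery detect it.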