• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Artificial Intelligence and Data Mining

A new kNN multi-label classification algorithm with label-specific features

JIANG Yun, XIAO Xiao, HOU Jinquan, CHEN Li

  1. (College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2017-12-25 Revised: 2018-01-24 Online: 2019-03-25 Published: 2019-03-25
  • Supported by:

    National Natural Science Foundation of China (61163036); Natural Science Foundation under the Gansu Province Science and Technology Program (1606RJZA047); 2012 Gansu Province Special Fund for Basic Scientific Research of Universities; Gansu Province University Graduate Supervisor Project (120116); Northwest Normal University Third-Phase Knowledge and Innovation Project for Key Research Staff (nwnukjcxgc0367)

Abstract:

In multi-label learning, each sample is associated with multiple class labels at the same time, yet it is described by a single feature vector. Most existing multi-label classification algorithms share a common strategy: they predict all class labels from the same feature set. This is not the best choice, because each label may be most strongly correlated with its own specific features. To address this problem, we propose IML-kNN, a k-nearest-neighbor multi-label classification algorithm that incorporates label-specific features. IML-kNN first preprocesses the feature vectors of the multi-label data and constructs, for each label, the features that are most discriminative for that label. It then classifies with an improved ML-kNN algorithm based on the resulting features. Experimental results show that IML-kNN clearly outperforms ML-kNN and three other commonly used multi-label classification algorithms on the yeast and image data sets.
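The two-stage idea in the abstract (a separate feature set per label, followed by a kNN-based multi-label classifier) can be illustrated with a small sketch. The abstract does not say how the label-specific features are constructed or how ML-kNN is improved, so everything below is an assumption: `select_label_features` is a hypothetical stand-in that keeps the `m` features whose class means differ most for a label, and `fit_predict` applies the standard ML-kNN maximum a posteriori rule with smoothing `s` inside each label's own feature subspace. It is not the authors' IML-kNN.

```python
import numpy as np

def select_label_features(X, y, m):
    """Hypothetical stand-in: indices of the m features whose positive and
    negative class means differ most (scaled by the feature's std)."""
    pos, neg = X[y == 1], X[y == 0]
    if len(pos) == 0 or len(neg) == 0:
        return np.arange(min(m, X.shape[1]))  # degenerate label: keep first m
    score = np.abs(pos.mean(axis=0) - neg.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return np.argsort(score)[::-1][:m]

def neighbor_counts(X_ref, y_ref, X_query, k, exclude_self=False):
    """For each query row, count how many of its k nearest reference rows
    (Euclidean distance) carry the label."""
    d = np.linalg.norm(X_query[:, None, :] - X_ref[None, :, :], axis=2)
    if exclude_self:  # leave-one-out when the query set is the reference set
        np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]
    return y_ref[idx].sum(axis=1)

def fit_predict(X_train, Y_train, X_test, k=3, m=1, s=1.0):
    """ML-kNN MAP rule, applied per label in that label's feature subspace."""
    X_train, X_test = np.asarray(X_train, float), np.asarray(X_test, float)
    Y_train = np.asarray(Y_train, dtype=int)
    n, q = Y_train.shape
    Y_pred = np.zeros((X_test.shape[0], q), dtype=int)
    for j in range(q):
        y = Y_train[:, j]
        feats = select_label_features(X_train, y, m)
        Xj, Xtj = X_train[:, feats], X_test[:, feats]
        prior1 = (s + y.sum()) / (2 * s + n)  # smoothed P(label j present)
        # Neighbour-count statistics on the training set (leave-one-out).
        c_train = neighbor_counts(Xj, y, Xj, k, exclude_self=True)
        kappa1 = np.bincount(c_train[y == 1], minlength=k + 1)
        kappa0 = np.bincount(c_train[y == 0], minlength=k + 1)
        like1 = (s + kappa1) / (s * (k + 1) + kappa1.sum())  # P(count | present)
        like0 = (s + kappa0) / (s * (k + 1) + kappa0.sum())  # P(count | absent)
        c_test = neighbor_counts(Xj, y, Xtj, k)
        Y_pred[:, j] = prior1 * like1[c_test] > (1 - prior1) * like0[c_test]
    return Y_pred

# Toy data (made up): feature 0 drives label 0, feature 1 drives label 1,
# feature 2 is low-amplitude noise.
X_train = np.array([[0, 0, 0.1], [0, 0, 0.3], [0, 10, 0.2], [0, 10, 0.4],
                    [10, 0, 0.1], [10, 0, 0.3], [10, 10, 0.2], [10, 10, 0.4]])
Y_train = np.array([[0, 0], [0, 0], [0, 1], [0, 1],
                    [1, 0], [1, 0], [1, 1], [1, 1]])
X_test = np.array([[10, 0, 0.25], [0, 10, 0.25]])
pred = fit_predict(X_train, Y_train, X_test, k=3, m=1)
print(pred)  # [[1 0] [0 1]]
```

On the toy data each label ends up being predicted from the one feature that actually determines it, which is the motivation the abstract gives for not sharing a single feature set across all labels.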
 

Key words: multi-label learning, ML-kNN, label-specific features, label correlation