• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (09): 1700-1710.

• Artificial Intelligence and Data Mining •

An imbalanced multi-label learning algorithm based on negative correlation enhancement

CHENG Yu-sheng1,2, CAO Tian-cheng1, WANG Yi-bin1, ZHENG Wei-jie1

  (1. University Key Laboratory of Intelligent Perception and Computing of Anhui Province (Anqing Normal University), Anqing 246133;

    2. Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education (Anhui University), Hefei 230061, China)

  • Received: 2020-08-11 Revised: 2020-11-25 Accepted: 2021-09-25 Online: 2021-09-25 Published: 2021-09-27
  • Supported by:
    Open Project of the Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education (Anhui University) (2020A003); Natural Science Foundation of Anhui Province (2018085MF216)

Abstract: Because the label space is large, imbalanced label distributions are widespread in multi-label datasets, and addressing this imbalance can improve the classification performance of multi-label learning to some extent. Exploiting label correlation is one of the most common and effective strategies for doing so. It has been studied extensively, but most existing work relies on positive correlation only. In practice, labels may also be negatively correlated, and considering negative correlation alongside positive correlation can be expected to further improve classifier performance. This paper therefore proposes MLNCE, an imbalanced multi-label learning algorithm based on negative correlation enhancement, which aims to alleviate multi-label imbalance while accounting for both positive and negative correlations among labels. The algorithm first uses label density information to transform the label space, then explores the true positive and negative correlations among labels in the resulting density label space and adds this information to the classifier's objective function, and finally solves for the output weights by accelerated gradient descent to obtain the predictions. In comparative experiments on 11 standard multi-label datasets against six other multi-label learning algorithms, MLNCE effectively improves classification accuracy.

Key words: multi-label learning, multi-label imbalance, positive and negative label correlation, label density, accelerated gradient descent
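
The three steps named in the abstract (transform the label space, extract signed label correlations, solve for output weights by accelerated gradient descent) can be illustrated with a minimal NumPy sketch. This is not the MLNCE formulation, which the abstract does not specify. Everything concrete below is an assumption made for illustration: the density transform (centering each label by its positive frequency), the use of cosine similarity as the correlation measure, the signed-Laplacian regularizer, the ridge-style objective, the hyperparameters lam and mu, and all function names.

```python
import numpy as np


def density_label_space(Y):
    """ASSUMPTION: hypothetical stand-in for a label-density transform.

    Positives of label j map to +(1 - p_j) and negatives to -p_j, where p_j
    is the fraction of positive instances, so rare positives get larger
    magnitude than frequent ones.
    """
    p = Y.mean(axis=0, keepdims=True)
    return Y - p


def label_correlation(D):
    """Cosine similarity between label columns of D; entries lie in [-1, 1]."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0  # guard against labels that are constant
    Dn = D / norms
    C = Dn.T @ Dn
    np.fill_diagonal(C, 0.0)  # ignore self-correlation
    return C


def signed_laplacian(C):
    """Signed graph Laplacian diag(sum_j |C_ij|) - C (positive semidefinite)."""
    return np.diag(np.abs(C).sum(axis=1)) - C


def objective(W, X, D, L, lam, mu):
    """ASSUMED objective: squared loss + ridge + correlation regularizer."""
    R = X @ W - D
    return (0.5 * np.sum(R * R)
            + 0.5 * lam * np.sum(W * W)
            + 0.5 * mu * np.trace(W @ L @ W.T))


def fit_agd(X, D, L, lam=0.1, mu=0.1, iters=300):
    """Nesterov-accelerated gradient descent on the smooth objective above."""
    d, q = X.shape[1], D.shape[1]
    # Lipschitz constant of the gradient: ||X||_2^2 + lam + mu * ||L||_2.
    lip = np.linalg.norm(X, 2) ** 2 + lam + mu * np.linalg.norm(L, 2)
    W = np.zeros((d, q))
    V, t = W.copy(), 1.0
    for _ in range(iters):
        G = X.T @ (X @ V - D) + lam * V + mu * V @ L  # gradient at V
        W_new = V - G / lip                            # gradient step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        V = W_new + ((t - 1.0) / t_new) * (W_new - W)  # momentum step
        W, t = W_new, t_new
    return W


# Toy data: label 1 is the complement of label 0, label 2 is rare.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y0 = (X[:, 0] > 0).astype(float)
Y = np.column_stack([y0, 1.0 - y0, (X[:, 1] > 1.0).astype(float)])

D = density_label_space(Y)
L = signed_laplacian(label_correlation(D))
W = fit_agd(X, D, L)
Y_pred = (X @ W > 0).astype(int)  # ASSUMPTION: threshold scores at 0
```

In this sketch the regularizer tr(W L Wᵀ) equals the sum over label pairs of |C_ij| · ‖w_i − sign(C_ij) w_j‖², so positively correlated labels are pulled toward similar weight vectors and negatively correlated labels toward opposite ones. In the raw {0,1} label space cosine similarity is never negative, so the signed transform is what makes negative correlation visible here.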