• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2024, Vol. 46, Issue (09): 1667-1674.

• Artificial Intelligence and Data Mining •

A multimodal multi-label classification method based on hypergraph

LU Bin(1,2), FAN Qiang(2,3), ZHOU Xiao-lei(2,3), YAN Hao(2,3), WANG Fang-xiao(2,3)

  1. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China
  2. The Sixty-third Research Institute, National University of Defense Technology, Nanjing 210007, China
  3. Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China

  • Received: 2023-09-01 Revised: 2023-10-17 Accepted: 2024-09-25 Online: 2024-09-25 Published: 2024-09-23

Abstract: Multi-label classification aims to select the most relevant subset of labels from a candidate label set to annotate an instance, and it has become an active research topic in artificial intelligence. Traditional multi-label learning methods mainly target single-modal data, and little work has studied how to mine the high-order correlations among multi-modal data. To address the insufficient representation of high-order correlations among multi-modal data in multi-label scenarios, this paper proposes a multi-modal multi-label classification method based on hypergraphs. A hypergraph model is introduced to model the high-order correlations of multi-modal data, and multi-modal feature fusion together with hyperedge convolution is used to mine the relationships among the multi-modal data and to recognize features, which improves the performance of multi-modal multi-label classification. Experiments were conducted on a movie genre classification task, and the proposed method was compared with traditional methods. The results show that the proposed method outperforms the comparison methods in accuracy, precision, and F1 score, which demonstrates its effectiveness.

Keywords: multi-label learning, data correlation, hypergraph, multi-modal
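
The abstract names three steps (multi-modal feature fusion, hypergraph modeling, hyperedge convolution) without giving their formulas. The following is only a minimal sketch of how such a pipeline is commonly assembled, not the authors' implementation. It assumes early fusion by concatenation, one k-nearest-neighbour hyperedge per sample, unit hyperedge weights, and the widely used normalised propagation D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X. All function names, the toy features, and the `theta` and `bias` values are hypothetical.

```python
import math


def fuse(modalities):
    """Early fusion (assumption): concatenate each sample's per-modality vectors."""
    return [sum((list(vec) for vec in sample), []) for sample in zip(*modalities)]


def knn_hyperedges(features, k):
    """One hyperedge per sample: the sample itself plus its k nearest neighbours."""
    edges = []
    for i, centre in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: math.dist(centre, features[j]))
        edges.append([i] + others[:k])
    return edges


def hypergraph_conv(x, edges):
    """Propagate features as D_v^-1/2 H D_e^-1 H^T D_v^-1/2 X (unit edge weights)."""
    n, dim = len(x), len(x[0])
    deg_v = [0] * n
    for edge in edges:
        for v in edge:
            deg_v[v] += 1
    out = [[0.0] * dim for _ in range(n)]
    for edge in edges:
        # Vertex -> hyperedge: average the degree-normalised member features.
        edge_feat = [
            sum(x[v][d] / math.sqrt(deg_v[v]) for v in edge) / len(edge)
            for d in range(dim)
        ]
        # Hyperedge -> vertex: scatter the hyperedge feature back to its members.
        for v in edge:
            for d in range(dim):
                out[v][d] += edge_feat[d] / math.sqrt(deg_v[v])
    return out


def predict_labels(x, theta, bias, threshold=0.5):
    """Linear layer plus sigmoid; each label is thresholded independently."""
    preds = []
    for row in x:
        logits = [
            sum(r * w for r, w in zip(row, col)) + b
            for col, b in zip(zip(*theta), bias)
        ]
        preds.append([int(1 / (1 + math.exp(-z)) >= threshold) for z in logits])
    return preds


# Toy data (hypothetical): four samples, a 2-d "text" and a 1-d "image" feature.
text = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
image = [[1.0], [1.1], [9.0], [9.1]]

fused = fuse([text, image])
edges = knn_hyperedges(fused, k=1)
smoothed = hypergraph_conv(fused, edges)

theta = [[0.0, 1.0], [0.0, 0.0], [1.0, 0.0]]  # hypothetical weights, 3 features x 2 labels
bias = [-0.5, -2.0]                           # hypothetical per-label bias
print(predict_labels(smoothed, theta, bias))  # [[1, 0], [1, 0], [1, 1], [1, 1]]
```

In this toy run, samples 0 and 1 share both of their hyperedges, so the convolution gives them the same smoothed feature vector and therefore the same predicted label set. A trained model would learn `theta` and `bias` from data, typically with a per-label binary cross-entropy loss.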