• Journal of the China Computer Federation (CCF)
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (11): 2018-2026.

• Artificial Intelligence and Data Mining •

  • Supported by:
    National Natural Science Foundation of China (61273229)

A hierarchical graph attention network text classification model that integrates label information

YANG Chun-xia1,2,MA Wen-wen1,2,XU Ben1,2,HAN Yu1,2   

  1. (1. School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China;
    2. Jiangsu Key Laboratory of Big Data Analysis Technology (B-DAT), Nanjing 210044, China)
  • Received:2021-12-17 Revised:2022-07-19 Accepted:2023-11-25 Online:2023-11-25 Published:2023-11-16



Abstract: Single-label text classification based on hierarchical graph attention networks currently suffers from two main limitations. First, existing models cannot extract text features effectively. Second, few studies further highlight text features by exploiting the connection between texts and their labels. To address these two issues, a hierarchical graph attention network text classification model that integrates label information is proposed. The model constructs an adjacency matrix from the relevance between sentence keywords and topics, and then uses a word-level graph attention network to obtain vector representations of sentences. Starting from randomly initialized target vectors, it applies max pooling to extract sentence-specific target vectors, so that the resulting sentence vectors carry more distinct category features. After the word-level graph attention layer, a sentence-level graph attention network produces a new text representation that incorporates word weight information, and a pooling layer extracts the feature information of the text. In parallel, GloVe pre-trained word vectors are used to initialize the vector representations of all annotated labels, which are then interacted and fused with the text features to reduce the loss of the original features, yielding feature representations that distinguish different texts. Experimental results on five public datasets (R52, R8, 20NG, Ohsumed, and MR) show that the classification accuracy of the proposed model is significantly higher than that of other mainstream baseline models.
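To make the pipeline described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation: a single masked graph attention layer over a toy word graph (standing in for the word-level network with a keyword-topic adjacency matrix), max pooling to a sentence vector, and a simple text-label interaction that fuses label embeddings back into the text feature. All dimensions, the toy adjacency matrix, and the random "label embeddings" (a stand-in for GloVe vectors) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention_layer(H, A, W, a):
    """One GAT-style layer.
    H: (n, d_in) node features; A: (n, n) adjacency (1 = edge);
    W: (d_in, d_out) projection; a: (2*d_out,) attention vector."""
    Z = H @ W                                   # project node features
    d = Z.shape[1]
    # Attention logits e_ij = LeakyReLU(a^T [z_i || z_j]), computed pairwise.
    left = Z @ a[:d]                            # contribution of node i
    right = Z @ a[d:]                           # contribution of node j
    e = left[:, None] + right[None, :]
    e = np.where(e > 0, e, 0.2 * e)             # LeakyReLU
    e = np.where(A > 0, e, -1e9)                # mask non-neighbours
    alpha = softmax(e, axis=1)                  # normalize over neighbours
    return np.tanh(alpha @ Z)                   # aggregated node features

# Toy word graph for one sentence: 4 words, edges meant to mimic
# keyword-topic relevance (hypothetical adjacency matrix).
H = rng.normal(size=(4, 8))
A = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]], dtype=float)
W = rng.normal(size=(8, 8)) * 0.1
a = rng.normal(size=16) * 0.1

words = graph_attention_layer(H, A, W, a)       # word-level representations
sentence = words.max(axis=0)                    # max pooling -> sentence vector

# Label fusion: label embeddings (stand-in for GloVe-initialized label
# vectors, 5 classes) interact with the text feature and are fused back in.
labels = rng.normal(size=(5, 8))
scores = labels @ sentence                      # text-label interaction
fused = sentence + softmax(scores) @ labels     # label-aware text feature
print(fused.shape)
```

In the full model this word-level layer would be followed by an analogous sentence-level graph attention layer over sentence nodes before the label-fusion step; the sketch collapses that hierarchy to one sentence for brevity.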

Key words: hierarchical graph attention network, single-label text classification, adjacency matrix, label information