• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (05): 945-950.

• Artificial Intelligence and Data Mining

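Distantly supervised relation extraction based on entity knowledge

MA Chang-lin, SUN Zhuang

  1. (School of Computer Science, Central China Normal University, Wuhan 430079, Hubei, China)
  • Received: 2023-10-26 Revised: 2023-11-21 Accepted: 2024-05-25 Online: 2024-05-25 Published: 2024-05-30
  • Funding:
    National Natural Science Foundation of China (62272189)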

Abstract: To reduce noise in the labeled data used for distantly supervised relation extraction, a distantly supervised relation extraction model that integrates entity descriptions with a self-attention mechanism is proposed. The model is based on multi-instance learning. To account for the combined effect of entity knowledge and positional relations, it takes as input the concatenation of the word, entity, entity description, and relative position vectors. A piecewise convolutional neural network serves as the sentence encoder and is combined with an improved structured self-attention mechanism to capture the internal correlations among features. The difference vector between the tail entity and the head entity is constructed as the supervision information for the attention mechanism, which assigns a weight to each sentence. Experimental results on the New York Times dataset show that the proposed model achieves the highest values on all performance metrics when compared with state-of-the-art models.

Key words: relation extraction, entity, entity description, piecewise convolutional neural network, self-attention mechanism
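
The two mechanisms named in the abstract, piecewise max pooling and sentence weighting guided by the entity difference vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a plain dot product as the attention score, the tail-minus-head direction of the difference vector, and the assumption that sentence vectors and entity vectors share the same dimension are all hypothetical choices made for illustration. The structured self-attention component is omitted.

```python
import math


def piecewise_max_pool(features, head_pos, tail_pos):
    """Max-pool per-token feature vectors over three segments.

    The sentence is split at the two entity positions (assumed token
    indices), and each feature dimension is max-pooled per segment.
    """
    first, second = sorted((head_pos, tail_pos))
    segments = [
        features[:first + 1],            # up to and including the first entity
        features[first + 1:second + 1],  # between the entities, incl. the second
        features[second + 1:],           # after the second entity
    ]
    dim = len(features[0])
    pooled = []
    for seg in segments:
        for d in range(dim):
            # Assumption: an empty segment contributes zeros.
            pooled.append(max((tok[d] for tok in seg), default=0.0))
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_attention(sentence_vecs, head_vec, tail_vec):
    """Weight the sentences of one bag using the entity difference vector.

    Assumption: the query is (tail - head) and the score is a dot product;
    a trained model would typically add a learned projection.
    """
    query = [t - h for h, t in zip(head_vec, tail_vec)]
    scores = [sum(q * s for q, s in zip(query, vec)) for vec in sentence_vecs]
    weights = softmax(scores)
    dim = len(sentence_vecs[0])
    bag_vec = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
        for d in range(dim)
    ]
    return weights, bag_vec


if __name__ == "__main__":
    # Toy bag with two sentence vectors and 2-dimensional entity vectors.
    weights, bag_vec = bag_attention(
        [[1.0, 0.0], [0.0, 1.0]], head_vec=[0.0, 0.0], tail_vec=[1.0, 0.0]
    )
    print(weights, bag_vec)
```

In this toy setup, a sentence whose vector aligns with the difference vector receives the larger weight, which is the intuition behind using entity knowledge to down-weight noisy sentences in a bag.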