• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (07): 1313-1320.

• Artificial Intelligence and Data Mining •

A text similarity calculation method based on multiple related information interaction

YUAN Ye, LIAO Wei

  1. (School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China)
  • Received: 2020-12-03 Revised: 2021-04-12 Accepted: 2022-07-25 Online: 2022-07-25 Published: 2022-07-25
  • Funding:
    National Natural Science Foundation of China (62001282); Shanghai Universities "Young Eastern Scholar" Program (QD2017043)


Abstract: Text similarity calculation is one of the core tasks in natural language processing. Traditional text similarity calculation methods consider only single-aspect features of a text, such as its structure or its semantics, and lack an in-depth analysis of multiple text features, which leads to low performance. This paper proposes a text similarity calculation method based on multiple related information interaction. Cosine-correlation features are added to the text embedding matrix; a self-attention mechanism captures the internal relevance of each text and its word dependencies; and an alternating co-attention mechanism then extracts the semantic interaction information between the texts, yielding deeper and richer text representations from different perspectives. Experimental results show that the proposed method achieves F1 scores of 0.916 1 and 0.769 5 on two datasets respectively, outperforming the baseline methods.
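The abstract states that cosine-correlation features are added to the text embedding matrix, but gives no implementation details. The following is a minimal numpy sketch of one plausible reading — appending, to each word vector, its best cosine match against the other text's words — with all function names hypothetical, not taken from the paper:

```python
import numpy as np

def cosine_matrix(a, b, eps=1e-8):
    """Pairwise cosine similarities between rows of a (n x d) and b (m x d)."""
    a_norm = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_norm = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_norm @ b_norm.T  # shape (n, m)

def add_cosine_feature(emb_a, emb_b):
    """Append each word's maximum cosine similarity to the other text
    as one extra feature column of the embedding matrix (a sketch of
    one possible interpretation, not the authors' exact scheme)."""
    sim = cosine_matrix(emb_a, emb_b)          # (n, m)
    feat_a = sim.max(axis=1, keepdims=True)    # (n, 1): best match in B
    feat_b = sim.max(axis=0)[:, None]          # (m, 1): best match in A
    return (np.concatenate([emb_a, feat_a], axis=1),
            np.concatenate([emb_b, feat_b], axis=1))
```

Each word vector thus gains a scalar indicating how strongly it is echoed in the other text, before any attention layers are applied.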

Keywords: text similarity, information interaction, bi-directional long short-term memory, self-attention mechanism, co-attention mechanism
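The keywords name the model's attention components. As a rough illustration of the two steps the abstract describes — self-attention within one text, then alternating co-attention between two texts — here is a numpy sketch over per-word hidden states (e.g. BiLSTM outputs); this is a generic scaled-dot-product formulation with hypothetical names, not the authors' exact model:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h):
    """Scaled dot-product self-attention over one text's hidden states (n x d):
    each word attends to every word of the same text, modelling internal
    relevance and word dependencies."""
    d = h.shape[-1]
    scores = h @ h.T / np.sqrt(d)        # word-to-word relevance within the text
    return softmax(scores, axis=-1) @ h  # (n, d)

def attend(h, query):
    """Pool hidden states h (n x d) into one vector, guided by a query (d,)."""
    weights = softmax(h @ query / np.sqrt(h.shape[-1]))
    return weights @ h

def alternating_co_attention(h_a, h_b):
    """Alternate attention between two texts: summarize A, use that summary to
    attend over B, then use B's summary to re-attend over A (a sketch of the
    alternating scheme, not the paper's exact architecture)."""
    s_a = attend(h_a, h_a.mean(axis=0))  # initial summary of text A
    s_b = attend(h_b, s_a)               # attend B, conditioned on A's summary
    s_a = attend(h_a, s_b)               # re-attend A, conditioned on B's summary
    return s_a, s_b
```

The two resulting summary vectors can then be compared (e.g. concatenated and fed to a classifier) to produce the similarity score that the F1 results evaluate.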

