• Sponsored journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (07): 1321-1330.

• Artificial Intelligence and Data Mining •

• Funding:
    Shanxi Provincial Natural Science Foundation (202203021221234)

A short text semantic matching strategy based on BERT sentence vector and differential attention

WANG Qin-chen1,DUAN Li-guo1,2,WANG Jun-shan3,ZHANG Hao-yan1,GAO Hao1   

  1. (1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030600;
    2. School of Information Technology Innovation, Shanxi University of Electronic Science and Technology, Linfen 041000;
    3. Network Security Corps of Beijing Municipal Public Security Bureau, Beijing 100740, China)
  • Received: 2023-03-31  Revised: 2023-06-21  Accepted: 2024-07-25  Online: 2024-07-25  Published: 2024-07-19


Abstract: Short text semantic matching is a core problem in natural language processing, with wide applications in automatic question answering, search engines, and other fields. Most previous work considers only the similar parts of two texts and ignores their differences, so models fail to exploit the key information that determines whether two texts match. To address this problem, this paper proposes a short text semantic matching strategy based on BERT character-sentence vectors and differential attention. BERT is used to vectorize sentence pairs; a BiLSTM with a multi-head differential attention mechanism obtains attention weights that capture differences in representational intent between each character vector and the global semantic information of the text; a one-dimensional convolutional neural network reduces the dimensionality of the sentence pair's semantic feature vectors; finally, the character and sentence vectors are concatenated and fed into a fully connected layer to compute the semantic matching degree between the two sentences. Experiments on the LCQMC and BQ Corpus datasets show that this strategy effectively extracts semantic difference information between texts, enabling the model to achieve better results.

Key words: short text semantic matching, character-sentence vector, representational intent, differential attention
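The pipeline described in the abstract (BERT character vectors → BiLSTM → multi-head differential attention → 1D CNN pooling → concatenation with the sentence vector → fully connected scoring layer) can be sketched roughly in PyTorch. This is a minimal illustrative sketch, not the paper's exact model: the dimensions, the mean-pooled global context `g`, the use of `nn.MultiheadAttention` over the difference `h - g`, and the absolute-difference interaction between the two sentence encodings are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class DiffAttnMatcher(nn.Module):
    """Hypothetical sketch of the matching pipeline from the abstract."""

    def __init__(self, char_dim=768, hidden=128, heads=4):
        super().__init__()
        # BiLSTM over BERT character vectors (output dim 2 * hidden).
        self.bilstm = nn.LSTM(char_dim, hidden,
                              batch_first=True, bidirectional=True)
        # Multi-head attention used here as a stand-in for the paper's
        # "differential attention": queries are each position's offset
        # from the global context, so weights reflect difference.
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        # 1D convolution reduces the feature dimension before pooling.
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=3, padding=1)
        # Final layer scores the interaction of the two encodings.
        self.fc = nn.Linear(hidden + char_dim, 1)

    def encode(self, char_vecs, cls_vec):
        h, _ = self.bilstm(char_vecs)                  # (B, L, 2H)
        g = h.mean(dim=1, keepdim=True)                # global context (B, 1, 2H)
        a, _ = self.attn(h - g, h, h)                  # difference-driven weights
        c = self.conv(a.transpose(1, 2))               # (B, H, L)
        pooled = c.max(dim=2).values                   # (B, H)
        # Concatenate pooled character features with the sentence vector.
        return torch.cat([pooled, cls_vec], dim=-1)    # (B, H + char_dim)

    def forward(self, chars_a, cls_a, chars_b, cls_b):
        va = self.encode(chars_a, cls_a)
        vb = self.encode(chars_b, cls_b)
        # Absolute difference as a simple interaction feature.
        return torch.sigmoid(self.fc(torch.abs(va - vb))).squeeze(-1)

# Usage with dummy tensors in place of real BERT outputs:
model = DiffAttnMatcher()
B, L = 2, 16
score = model(torch.randn(B, L, 768), torch.randn(B, 768),
              torch.randn(B, L, 768), torch.randn(B, 768))
```

Each element of `score` is a matching probability in [0, 1] for one sentence pair; in the real model the `(B, L, 768)` and `(B, 768)` inputs would come from BERT's character-level outputs and its sentence-level representation.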