• A journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (07): 1321-1330.

• Artificial Intelligence and Data Mining •

A short text semantic matching strategy based on BERT sentence vector and differential attention

WANG Qin-chen1,DUAN Li-guo1,2,WANG Jun-shan3,ZHANG Hao-yan1,GAO Hao1   

  (1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030600;
   2. School of Information Technology Innovation, Shanxi University of Electronic Science and Technology, Linfen 041000;
   3. Network Security Corps of Beijing Municipal Public Security Bureau, Beijing 100740, China)
  • Received: 2023-03-31  Revised: 2023-06-21  Accepted: 2024-07-25  Online: 2024-07-25  Published: 2024-07-19

Abstract: Short text semantic matching is a core problem in natural language processing, with wide applications in automatic question answering, search engines, and other fields. Most previous work considered only the similar parts between texts while ignoring the parts where they differ, so models could not fully exploit the key information needed to decide whether two texts match. To address this issue, this paper proposes a short text semantic matching strategy based on BERT sentence vectors and differential attention. BERT is used to vectorize sentence pairs, BiLSTM encodes them, and a multi-head differential attention mechanism is introduced to obtain attention weights that capture the intention differences between each word vector and the global semantic information of the text. A one-dimensional convolutional neural network then reduces the dimension of the sentence pair's semantic feature vectors. Finally, the word and sentence vectors are concatenated and fed into a fully connected layer to compute the semantic matching degree between the two sentences. Experiments on the LCQMC and BQ datasets show that this strategy effectively extracts semantic difference information between texts, enabling the model to achieve better results.
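The differential attention step described in the abstract can be sketched in a simplified, single-head form. This is an illustrative assumption, not the paper's implementation: the function name `differential_attention`, the dot-product scoring of per-token differences against the global sentence vector, and the numpy setting are all hypothetical, and the BERT/BiLSTM encoders, multi-head structure, and 1D-CNN reduction are omitted.

```python
import numpy as np

def differential_attention(word_vecs: np.ndarray, sent_vec: np.ndarray) -> np.ndarray:
    """Weight each token by how its difference from the global sentence
    vector aligns with that global semantics (a single-head sketch).

    word_vecs: (seq_len, d) token representations
    sent_vec:  (d,) global sentence representation
    returns:   (d,) difference-aware sentence representation
    """
    # Per-token difference from the global semantic information.
    diff = word_vecs - sent_vec
    # Scaled dot-product scores between each difference and the sentence vector.
    scores = diff @ sent_vec / np.sqrt(word_vecs.shape[1])
    # Softmax over tokens to get attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted sum of token vectors, emphasizing tokens that differ.
    return weights @ word_vecs
```

In the full model, one such representation per sentence would be concatenated with the sentence vectors and passed to the fully connected layer that outputs the matching score.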

Key words: short text semantic matching, word-sentence vector, representation intention, differential attention