• Journal of the China Computer Federation (CCF)
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A short text similarity calculation method based
on semantics and syntax structure

ZHAO Qian1, JING Qi1, LI Aiping1,2, DUAN Liguo1

  (1. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024;
   2. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China)
  • Received: 2016-12-12 Revised: 2017-02-15 Online: 2018-07-25 Published: 2018-07-05

Abstract:

To improve the accuracy of short text semantic similarity calculation, we propose a new calculation method. First, the short text is segmented into sentence units, and syntactic dependency analysis is performed on each sentence. Similarity between sentences is built on similarity between words. When calculating word semantic similarity, we take the emotional characteristics of words into account and put forward a comprehensive method for word sense disambiguation. Based on the part of speech of each word and its context, we use the HowNet semantic dictionary to calculate word semantic similarity. The semantic similarity of two sentences is then obtained as a weighted average of the semantic similarities between their words, with weights assigned according to sentence structure. Finally, we calculate the semantic similarity of short texts through a new method called binary set. Experimental results show that the accuracy of word similarity and short text similarity reaches 87.63% and 93.77%, respectively, which demonstrates an improvement in the accuracy of semantic similarity calculation.
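
The sentence-level step described above (a structure-weighted average of word similarities) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the weights nor the HowNet-based word measure, so `ROLE_WEIGHTS`, `toy_word_sim`, and the best-match pairing are all assumptions made for illustration, and the binary set step for whole short texts is not modeled.

```python
from typing import Callable, Dict, List, Tuple

# A word is (token, dependency role). Role names and weights are hypothetical;
# the abstract does not state how sentence structure maps to weights.
Word = Tuple[str, str]
ROLE_WEIGHTS: Dict[str, float] = {"core": 0.5, "subject": 0.3, "object": 0.3}
DEFAULT_WEIGHT = 0.1  # assumed weight for any other role


def toy_word_sim(a: str, b: str) -> float:
    """Stand-in for a HowNet-based word similarity (assumption, toy values)."""
    if a == b:
        return 1.0
    table = {
        frozenset(("buy", "purchase")): 0.9,
        frozenset(("car", "vehicle")): 0.8,
    }
    return table.get(frozenset((a, b)), 0.0)


def directed_sim(src: List[Word], dst: List[Word],
                 word_sim: Callable[[str, str], float]) -> float:
    """Weighted average, over words in src, of each word's best match in dst."""
    if not src or not dst:
        return 0.0
    weighted_sum = 0.0
    total_weight = 0.0
    for token, role in src:
        weight = ROLE_WEIGHTS.get(role, DEFAULT_WEIGHT)
        best = max(word_sim(token, other) for other, _ in dst)
        weighted_sum += weight * best
        total_weight += weight
    return weighted_sum / total_weight


def sentence_sim(a: List[Word], b: List[Word],
                 word_sim: Callable[[str, str], float] = toy_word_sim) -> float:
    """Symmetric sentence similarity: mean of the two directed scores."""
    return 0.5 * (directed_sim(a, b, word_sim) + directed_sim(b, a, word_sim))


s1 = [("I", "subject"), ("buy", "core"), ("car", "object")]
s2 = [("I", "subject"), ("purchase", "core"), ("vehicle", "object")]
print(round(sentence_sim(s1, s2), 4))
```

Averaging the two directed scores keeps the measure symmetric, so the result does not depend on which sentence is listed first. In a real system, `toy_word_sim` would be replaced by a sense-disambiguated HowNet similarity, and the roles would come from a dependency parser.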
 

Key words: word sense disambiguation, emotional characteristic, syntactic dependency analysis, short text semantic similarity