• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Short text representation learning based
on semantic feature space context

TUO Ting1,MA Huifang1,2,WEI Jiahui1,LIU Haijiao1   

  (1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070;
    2. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 514004, China)
     
  • Received: 2017-10-24 Revised: 2018-04-11 Online: 2019-02-25 Published: 2019-02-25

Abstract:

Text representation is a fundamental task in natural language processing. To address the high dimensionality and sparsity of traditional short text representations, we propose a short text representation learning method based on semantic feature space context, called SFCR. Because the initial feature space is high-dimensional, we first compute the mutual information and co-occurrence relationships between terms, derive an initial term similarity from them, and cluster the terms semantically. The cluster centers then represent the reduced-dimension semantic feature space. Second, using the context information of the terms within each resulting cluster, we design three similarity measures to compute the similarity between the terms of the short text to be represented and the feature terms in the feature space. From these similarities we construct the text mapping matrix used for short text representation learning. Experimental results show that the proposed method reflects the semantic information of short texts well and yields reasonable and effective representations.
 

Key words: semantic feature space, similarity calculation, text mapping matrix, short text representation
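
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the SFCR method itself: the abstract does not give the exact formulas, so the code below assumes normalized pointwise mutual information (NPMI) over document-level co-occurrence as the term similarity, skips the clustering step by taking the cluster centers as given, and maps a short text to one value per center using the maximum similarity. The function names `npmi_similarity` and `represent`, the toy corpus, and the hand-picked centers are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def npmi_similarity(docs):
    """Return {(a, b): NPMI} for term pairs (a < b) co-occurring in some document.

    Assumption: probabilities are estimated from document frequencies.
    """
    n = len(docs)
    term_df = Counter()
    pair_df = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))

    sim = {}
    for (a, b), c in pair_df.items():
        if c == n:
            # The pair occurs in every document; NPMI is defined as 1 here.
            sim[(a, b)] = 1.0
            continue
        pmi = math.log((c * n) / (term_df[a] * term_df[b]))
        sim[(a, b)] = pmi / -math.log(c / n)
    return sim


def represent(doc, centers, sim):
    """Map a short text to a vector with one entry per cluster center.

    Each entry is the largest non-negative similarity between any term of the
    text and that center (a hypothetical choice, not one of the paper's three
    similarity measures).
    """
    vector = []
    for center in centers:
        best = 0.0
        for term in set(doc):
            if term == center:
                score = 1.0
            else:
                key = (term, center) if term < center else (center, term)
                score = sim.get(key, 0.0)
            best = max(best, score)
        vector.append(best)
    return vector


# Toy corpus of tokenized short texts (hypothetical data).
docs = [
    ["apple", "fruit"],
    ["apple", "phone"],
    ["fruit", "juice"],
    ["phone", "screen"],
]
sim = npmi_similarity(docs)
centers = ["fruit", "phone"]  # assumed cluster centers, chosen by hand
vec = represent(["juice"], centers, sim)
```

In this sketch the length of the output vector equals the number of cluster centers, which is how representing the feature space by cluster centers reduces the dimension relative to a full vocabulary-sized representation.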