  • Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

Research on the application of topic model in short text

HAN Xiao-yun, HOU Zai-en, SUN Mian

  1. (School of Arts and Science, Shaanxi University of Science and Technology, Xi’an 710021, China)

  • Received: 2019-03-19 Revised: 2019-06-13 Online: 2020-01-25 Published: 2020-01-25

Abstract:

This paper addresses the problem that traditional LDA-based topic models are susceptible to sparseness, noise, and redundancy when applied to short texts. First, it reviews the changes in text feature representation and the development of topic models for short texts. It then systematically summarizes the generative processes of the Latent Dirichlet Allocation (LDA) model and the Dirichlet Multinomial Mixture (DMM) model, together with the corresponding derivations of their Gibbs sampling parameters. For choosing the optimal number of topics in a topic model, it gives a detailed comparison of four common selection criteria. Finally, it analyzes extensions of topic models from the past two years and a simple application to online public opinion, and points out the directions and priorities for future topic model research.

Key words: latent Dirichlet allocation (LDA) model, Dirichlet multinomial mixture (DMM) model, short text, topic model, internet public opinion, Gibbs sampling