• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (9): 160-165.

• Articles •

Classification of Microblog Sentiment Based on Naïve Bayesian

LIN Jianghao1,YANG Aimin2,ZHOU Yongmei2,CHEN Jin3,CAI Zejian2   

  1. School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, China
  2. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China
  3. School of English Language and Culture, Guangdong University of Foreign Studies, Guangzhou 510006, China
  • Received: 2012-04-13  Revised: 2012-06-25  Online: 2012-09-25  Published: 2012-09-25

Abstract:

This paper applies a two-stage (“twice”) sentiment feature extraction approach, using syntactic dependency as the first extraction method and a semantic lexicon as the second. A naïve Bayesian sentiment classifier is built to classify the sentiment polarity of two collected data sets: hot-topic posts from Chinese microblogs and hotel reviews. The experiments compare the classification performance of different combinations of emoticons, punctuation, semantic-lexicon feature extraction, and twice sentiment feature extraction, in order to identify better preprocessing methods for sentiment classification of microblog text. They also compare the classification results on microblog text with those on hotel reviews to identify the factors that affect microblog sentiment classification performance. The results indicate that twice sentiment feature extraction yields the higher F1 score, and that the combination “emoticons + punctuation + twice sentiment feature extraction + BOOL” is the best preprocessing method. They also suggest that the naïve Bayesian classifier performs better on hotel reviews probably because the topics of microblog posts are more varied.

Key words: microblog; text sentiment classification; twice sentiment feature extraction; naïve Bayesian
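
To illustrate the kind of classifier the abstract describes, the following is a minimal sketch of a naïve Bayesian sentiment classifier with Boolean (BOOL) feature weighting, where a feature counts once per document regardless of how often it occurs. It is not the authors' implementation: the class name `BooleanNaiveBayes`, the Laplace smoothing, and the toy English tokens are all assumptions made for illustration, and the syntactic-dependency and semantic-lexicon extraction stages are not reproduced. Emoticons and punctuation are simply treated as ordinary features here.

```python
import math
from collections import Counter, defaultdict


def boolean_features(tokens):
    """BOOL weighting: keep each distinct feature once per document."""
    return set(tokens)


class BooleanNaiveBayes:
    """Hypothetical naive Bayes over binarized features (illustration only)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)          # documents per class
        self.feat_counts = defaultdict(Counter)    # class -> feature -> documents containing it
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            feats = boolean_features(tokens)
            self.feat_counts[label].update(feats)
            self.vocab |= feats
        self.n_docs = len(labels)
        return self

    def log_posterior(self, tokens, label):
        # Log prior plus log likelihood of each known feature present in the
        # document, with add-one (Laplace) smoothing (an assumption).
        score = math.log(self.doc_counts[label] / self.n_docs)
        total = sum(self.feat_counts[label].values())
        for feat in boolean_features(tokens) & self.vocab:
            count = self.feat_counts[label][feat]
            score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def predict(self, tokens):
        return max(self.classes, key=lambda c: self.log_posterior(tokens, c))


# Toy, made-up training data; emoticons and punctuation are kept as features.
train_docs = [
    ["great", "room", ":)", "!"],
    ["love", "it", ":)"],
    ["terrible", "service", ":(", "!"],
    ["bad", "room", ":("],
]
train_labels = ["pos", "pos", "neg", "neg"]

model = BooleanNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["room", ":)"]))
print(model.predict(["bad", ":(", "!"]))
```

In a setup like the one described, the token lists would instead come from the feature extraction stages, and dropping or keeping emoticon and punctuation tokens before calling `fit` would correspond to the different preprocessing combinations being compared.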