• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (03): 512-519.

• Artificial Intelligence and Data Mining •

A Chinese sentiment analysis model combining character and word information

YANG Chun-xia (1,2,3), YAO Si-cheng (1,2,3), SONG Jin-jian (1,2,3)

  (1. School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044;
   2. Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044;
   3. Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China)
  • Received: 2021-07-05  Revised: 2021-11-21  Accepted: 2023-03-25  Online: 2023-03-25  Published: 2023-03-23

Abstract: Chinese sentiment analysis models usually represent text with word-granularity information only, so the model loses character-granularity features during feature extraction. In addition, commonly used word segmentation models produce overly concise segmentation results, which limits the richness of the text representation to some extent. To address these problems, a Chinese sentiment analysis model that combines character-granularity and word-granularity features is proposed. Full pattern word segmentation is used to obtain a richer word sequence. After word embedding, the word vectors are fed into a Bi-LSTM to extract the semantic information of the full text, and the hidden semantic representations are initially fused with the corresponding word vectors to make the word-level information more robust. In parallel, the character vectors are fed into a multi-window convolution to capture finer-grained character-level features. Finally, the character-granularity and word-granularity features are further fused and fed into the classifier to obtain the sentiment classification result. Test results on two public datasets show that the model improves classification performance compared with similar models.

Key words: Chinese sentiment analysis, full pattern word segmentation, multi-granularity fusion
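
Illustrative note: the abstract does not say how full pattern word segmentation is implemented, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes a tiny hand-made dictionary (`VOCAB`) and two hypothetical helper functions. `forward_max_match` is a simple stand-in for a conventional segmenter that returns one concise segmentation, and `full_mode_segment` emits every dictionary word found in the sentence, which gives the richer, overlapping word sequence the abstract refers to. Segmenters such as jieba offer a comparable "full mode", but whether the paper used that tool is not stated here.

```python
# Toy dictionary; an assumption for illustration only.
VOCAB = {"来到", "北京", "清华", "清华大学", "华大", "大学"}


def forward_max_match(sentence, vocab, max_len=4):
    """Concise segmentation: greedily take the longest dictionary word at each position."""
    tokens = []
    pos = 0
    while pos < len(sentence):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(sentence) - pos), 0, -1):
            candidate = sentence[pos:pos + length]
            if length == 1 or candidate in vocab:
                tokens.append(candidate)
                pos += length
                break
    return tokens


def full_mode_segment(sentence, vocab, max_len=4):
    """Full-pattern segmentation: emit every dictionary word, overlaps included.

    A character is emitted on its own only if no dictionary word starts at it
    and no earlier match already covers it.
    """
    tokens = []
    covered_until = 0  # index just past the furthest character covered by a match
    for start in range(len(sentence)):
        matched = False
        for end in range(start + 2, min(len(sentence), start + max_len) + 1):
            candidate = sentence[start:end]
            if candidate in vocab:
                tokens.append(candidate)
                covered_until = max(covered_until, end)
                matched = True
        if not matched and start >= covered_until:
            tokens.append(sentence[start])
    return tokens


sentence = "我来到北京清华大学"
print(forward_max_match(sentence, VOCAB))
# ['我', '来到', '北京', '清华大学']
print(full_mode_segment(sentence, VOCAB))
# ['我', '来到', '北京', '清华', '清华大学', '华大', '大学']
```

In this toy case the full-pattern output contains every token of the concise segmentation plus the overlapping words 清华, 华大 and 大学, so a downstream word-embedding layer sees more candidate words for the same sentence.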