
Computer Engineering & Science (计算机工程与科学) ›› 2022, Vol. 44 ›› Issue (01): 149-158.

• Artificial Intelligence and Data Mining •

A sentiment unit representation method based on layer hierarchy
(基于层次体系的情感单元表示方法)

ZHANG Bao-hua (张宝华)1, LI En-lin (李奀林)2, ZHANG Hua-ping (张华平)1, SHANG Jian-yun (商建云)1

  1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  2. Training Management Department of the Central Military Commission, Beijing 100142, China
  • Received: 2020-10-05 Revised: 2020-11-04 Accepted: 2022-01-25 Published: 2022-01-25 Released online: 2022-01-13
  • Funding: National Key Research and Development Program of China (2018YFC0832304)

Abstract: Sentiment words are the basic units of sentiment analysis, so the sentiment lexicon plays a decisive role in it. Current methods for building sentiment lexicons use only the semantic and word-formation information of a word and ignore the context in which it appears. As a result, traditional semantic methods struggle to derive a sentiment weight for words whose semantics are unknown, and they struggle to compute the true weight of words that have acquired new usages through changes in context. To address this, the paper first proposes a sentiment analysis hierarchy that runs from character composition up to the document level. Each layer has a representation that maps to the layer above it and a formula for computing its sentiment value, which refines the unit of analysis down to the word level. On this basis, the paper proposes a method for automatically constructing sentiment semantic units from the character composition and the context of a word. The method uses an existing sentiment lexicon and computes a word's sentiment weight from both the characters that form it and the sentiment tendency of the contexts in which it occurs, which yields more accurate results. Experiments on a real social network dataset show that the sentiment units constructed by this method improve accuracy by 3% over previous methods. The sentiment units can also be used directly in sentiment analysis tasks, where accuracy improves by 9% in rule-based experiments and by 3% with deep learning methods.


Key words: sentiment analysis, sentiment hierarchy, sentiment unit, word formation (character), context
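
The abstract states that a word's sentiment weight is computed from a known lexicon, the characters that form the word, and the sentiment tendency of its contexts, but it does not give the formulas. The following is only a minimal sketch of that general idea under stated assumptions, not the paper's method: the averaging rules, the mixing parameter `alpha`, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict


def char_scores(seed_lexicon):
    """Assumed rule: a character's score is the mean weight of the seed words containing it."""
    totals, counts = defaultdict(float), defaultdict(int)
    for word, weight in seed_lexicon.items():
        for ch in set(word):
            totals[ch] += weight
            counts[ch] += 1
    return {ch: totals[ch] / counts[ch] for ch in totals}


def formation_score(word, ch_scores):
    """Mean score of the word's characters that appear in the seed lexicon (0.0 if none do)."""
    known = [ch_scores[ch] for ch in word if ch in ch_scores]
    return sum(known) / len(known) if known else 0.0


def context_score(word, sentences, seed_lexicon):
    """Mean weight of seed words that co-occur with `word` in the same tokenized sentence."""
    weights = []
    for tokens in sentences:
        if word in tokens:
            weights.extend(seed_lexicon[t] for t in tokens
                           if t != word and t in seed_lexicon)
    return sum(weights) / len(weights) if weights else 0.0


def sentiment_weight(word, sentences, seed_lexicon, alpha=0.5):
    """Hypothetical linear mix of the character-based and context-based scores."""
    formation = formation_score(word, char_scores(seed_lexicon))
    context = context_score(word, sentences, seed_lexicon)
    return alpha * formation + (1 - alpha) * context


if __name__ == "__main__":
    # Toy seed lexicon and corpus, invented for this example.
    seed = {"好看": 1.0, "难看": -1.0, "开心": 1.0}
    corpus = [["好心", "开心"], ["好心", "难看"]]
    print(sentiment_weight("好心", corpus, seed))
```

In this sketch, a word absent from the seed lexicon still receives a weight as long as some of its characters or some of its neighbouring words are known. That is the situation the abstract describes for words with unknown semantics or context-dependent new usages.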