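• Official journal of the China Computer Federation
• Chinese Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)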

• Paper

A Chinese sentiment lexicon extension method based on mixing features

XIE Song-xian1, ZHAO Shu-yi2

  1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
  2. State Grid of China Technology College, Tai'an, Shandong 271000, China
  • Received: 2015-05-05 Revised: 2016-08-11 Online: 2016-07-25 Published: 2016-07-25

Abstract:

A sentiment lexicon with wide coverage and good domain adaptability can effectively improve the performance of text sentiment analysis. We first design two Chinese sentiment lexicon extension algorithms, one based on the linguistic feature of conjunctions and the other on the statistical feature of part-of-speech (POS) feature vectors. We then propose a mixed-feature algorithm that combines the two. The algorithm computes fine-grained positive and negative polarity values for words, extends a general-purpose sentiment lexicon within a domain to improve its coverage, and adjusts the lexicon within the domain to improve its adaptability. Experimental results show that the extended lexicon has better coverage and adaptability within the domain than the general-purpose lexicon, and that its performance on sentiment classification approaches that of a supervised method.
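The conjunction feature mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the conjunction lists, the function name `extend_lexicon`, the `(left_word, conjunction, right_word)` input format, and the simple vote-averaging rule are all hypothetical and are not the authors' method. The sketch only shows the general idea that words joined by a coordinating conjunction tend to share polarity, while words joined by an adversative conjunction tend to have opposite polarity.

```python
from collections import defaultdict

# Hypothetical conjunction lists (assumptions, not taken from the paper).
COORDINATING = {"和", "并且", "而且"}  # "and"-type: polarity tends to be preserved
ADVERSATIVE = {"但是", "但", "却"}     # "but"-type: polarity tends to flip


def extend_lexicon(seed, triples):
    """Score unknown words that are linked to seed words by a conjunction.

    seed:    dict mapping a word to a polarity value in [-1, 1]
    triples: iterable of (left_word, conjunction, right_word)
    Returns a dict holding scores for new words only (mean of their votes).
    """
    votes = defaultdict(list)
    for left, conj, right in triples:
        if conj in COORDINATING:
            sign = 1.0
        elif conj in ADVERSATIVE:
            sign = -1.0
        else:
            continue  # unknown conjunction: no evidence either way
        # Propagate from whichever side is already in the seed lexicon.
        if left in seed and right not in seed:
            votes[right].append(sign * seed[left])
        elif right in seed and left not in seed:
            votes[left].append(sign * seed[right])
    return {word: sum(v) / len(v) for word, v in votes.items()}


seed = {"好": 1.0, "差": -1.0}  # "good", "bad"
triples = [
    ("好", "而且", "实惠"),  # good AND affordable
    ("好", "但是", "昂贵"),  # good BUT expensive
    ("昂贵", "和", "差"),    # expensive AND bad
]
new_words = extend_lexicon(seed, triples)
```

With this toy input, `new_words` assigns 1.0 to 实惠 and -1.0 to 昂贵. A real system would extract the triples from a domain corpus and, as the abstract describes, combine this linguistic evidence with statistical features rather than rely on it alone.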