• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (5): 146-150.


Research on Sentiment Analysis Combining Rules and Statistics

ZAN Hongying (昝红英)1, ZUO Weisong (左维松)1, ZHANG Kunli (张坤丽)1, WU Yunfang (吴云芳)2

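  1. (1. School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450052; 2. Institute of Computational Linguistics, Peking University, Beijing 100871)
  • Received: 2010-03-21 Revised: 2010-09-28 Online: 2011-05-25 Published: 2011-05-25
  • About the authors: ZAN Hongying (born 1966), female, from Jiaozuo, Henan; Ph.D., associate professor; research interests: natural language processing and text mining. ZUO Weisong (born 1982), male, from Xinyang, Henan; master's degree; research interest: text sentiment orientation analysis. ZHANG Kunli (born 1977), female, from Gongyi, Henan; master's degree, lecturer; research interest: natural language processing.
  • Funding:

    National 863 Program of China (2007AA01Z198); National Natural Science Foundation of China (60970083); National Social Science Foundation of China (08CYY016)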

Sentiment Analysis Based on Rules and Statistics

ZAN Hongying1,ZUO Weisong1,ZHANG Kunli1,WU Yunfang2   

  1. (1. School of Information Engineering, Zhengzhou University, Zhengzhou 450052;
    2. Institute of Computational Linguistics, Peking University, Beijing 100871, China)
  • Received: 2010-03-21 Revised: 2010-09-28 Online: 2011-05-25 Published: 2011-05-25

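Abstract (translated from the Chinese):

Based on the basic idea of the recursive divide-and-conquer strategy, this paper constructs a new sentiment analysis model and explains why the model is reasonable. The paper first analyzes the strengths and weaknesses of resource-based and statistical methods. Resource-based sentiment orientation analysis has the advantage that its sentiment lexicon is accurate, but its completeness is poor; statistical methods are exactly the opposite. The paper then proposes a method that combines rules and statistics to analyze the sentiment orientation of text, applies this combined method to the model, and verifies its effectiveness. Experiments show that, on imbalanced data, the method achieves an accuracy of 77.68%.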

Keywords: Chinese information processing, sentiment classification, collocation rules, decision table

Abstract:

In this paper, we propose a new sentiment analysis model based on the recursive divide-and-conquer strategy and explain the rationale for the model. The paper first analyzes the advantages and disadvantages of resource-based and statistical approaches to sentiment analysis. The resource-based approach has an accurate sentiment vocabulary but poor completeness, whereas statistical methods show the opposite trade-off. The paper then proposes analyzing the sentiment of texts by combining rules with statistics, applies the combined method to the model, and verifies its effectiveness. The method attains an accuracy of 77.68% in the experiments, even though the data are imbalanced.

Key words: Chinese information processing; sentiment classification; collocation rules; decision list
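
The trade-off described in the abstract (an accurate but incomplete lexicon versus a broader but noisier statistical model) can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give their collocation rules, decision table, or recursive structure. Everything below is an assumption made for illustration, including the toy English lexicon `LEXICON`, the negation rule, the naive Bayes fallback, and the tiny training set. The idea shown is only that a rule fires when the lexicon covers the text, and a statistical classifier decides otherwise.

```python
import math
from collections import Counter

# Hypothetical toy lexicon and negators (assumptions, not from the paper).
LEXICON = {"good": 1, "excellent": 1, "bad": -1, "terrible": -1}
NEGATORS = {"not", "never"}


def rule_score(tokens):
    """Sum lexicon polarities; a word directly preceded by a negator is flipped."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing, used as the statistical fallback."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)
            for tok in tokens:
                logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def classify(tokens, fallback):
    """Use the rule when the lexicon gives a nonzero score, else the statistical model."""
    score = rule_score(tokens)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return fallback.predict(tokens)


# Tiny illustrative training set (made up for this sketch).
train_docs = [["love", "this", "phone"], ["works", "great"],
              ["waste", "of", "money"], ["broke", "quickly"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)

print(classify("not good".split(), model))        # rule path: neg
print(classify("waste of money".split(), model))  # fallback path: neg
```

In this sketch the rule path handles "not good" because the lexicon covers "good", while "waste of money" contains no lexicon word and is left to the statistical fallback, which mirrors the complementary strengths the abstract attributes to the two approaches.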