• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (03): 577-584.

Analysis of microblog emoticons based on statistical data and its application in emotion analysis

刘宝芹,牛耘,张景   

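  1. (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China)
  • Received: 2015-05-25 Revised: 2015-07-01 Online: 2016-03-25 Published: 2016-03-25
  • Funding:

    National Natural Science Foundation of China (61202132)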

Statistical analysis of emoticons and its application in emotion analysis

LIU Baoqin,NIU Yun,ZHANG Jing   

  1. (School of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing 210016,China)
  • Received:2015-05-25 Revised:2015-07-01 Online:2016-03-25 Published:2016-03-25

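Abstract (translated from the Chinese):

As an emerging form of online language, emoticons are favored by a growing number of microblog users. Emoticons in microblogs convey the author's emotions vividly and directly, and they play a crucial role in emotion analysis. We first carry out a statistical analysis of the usage patterns, distribution, and emotion-expression characteristics of emoticons in a large collection of Chinese microblogs. We then manually select representative emoticons with a clear emotional orientation as seed emoticons for six basic emotion categories. Based on the co-occurrence of a target emoticon with the seed emoticons of the six categories in microblog text, we build a six-dimensional emotion vector for it and apply these vectors to microblog emotion analysis. Experimental results on two datasets show that the emoticon emotion vectors built in this paper effectively improve the precision of microblog emotion recognition.

Keywords: emoticon, emotion vector, statistical analysis, emotion analysis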

Abstract:

As a new form of online language, emoticons are favored by an increasing number of microblog users. Emoticons in microblogs vividly represent bloggers' emotions and play an important role in emotion analysis. We first make a comprehensive statistical analysis of emoticons in a large corpus of Chinese microblogs, covering their usage, distribution, and characteristics in emotion expression. We then manually select a list of emoticons that typically indicate six basic emotions as seeds. Based on the co-occurrence between a target emoticon and the seed emoticons in a large corpus, we build a six-dimensional emotion vector for the target emoticon and apply these vectors to emotion analysis. Experimental results on two datasets show that the emoticon vectors effectively improve the precision of microblog emotion recognition.

Key words: emoticon; emotion vector; statistical analysis; emotion analysis
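
The co-occurrence step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names neither the six emotions nor the seed emoticons nor the weighting formula, so the emotion labels, the seed sets, the bracketed emoticon notation, and the simple normalized-count weighting below are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed labels for the six basic emotions (not given in the abstract).
EMOTIONS = ["happiness", "sadness", "anger", "fear", "disgust", "surprise"]

# Hypothetical seed emoticons for each emotion, written in a bracketed form.
SEEDS = {
    "happiness": {"[haha]", "[smile]"},
    "sadness": {"[cry]", "[sad]"},
    "anger": {"[angry]"},
    "fear": {"[scared]"},
    "disgust": {"[vomit]"},
    "surprise": {"[shock]"},
}

# Assumed notation: an emoticon is a bracketed token such as "[haha]".
EMOTICON_RE = re.compile(r"\[[^\[\]\s]+\]")


def emotion_vector(target, posts):
    """Return a six-dimensional vector for `target`.

    Each component is the share of co-occurrences (counted once per post)
    between `target` and the seed emoticons of one emotion. This normalized
    count is an assumed weighting, chosen only for simplicity.
    """
    counts = Counter()
    for post in posts:
        found = set(EMOTICON_RE.findall(post))
        if target not in found:
            continue
        # Ignore the target itself so a seed does not co-occur with itself.
        others = found - {target}
        for emotion, seeds in SEEDS.items():
            if others & seeds:
                counts[emotion] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(EMOTIONS)
    return [counts[e] / total for e in EMOTIONS]


# Toy corpus, invented for the example.
example_posts = [
    "so funny [grin] [haha]",
    "[grin] [smile] nice day",
    "[grin] [cry] bittersweet",
    "no emoticon here",
    "[haha] only",
]

print(emotion_vector("[grin]", example_posts))
```

A vector of this kind could then be appended to the text features of a post before classification. A corpus-scale version would more likely use an association measure such as pointwise mutual information in place of raw shares, which is again an assumption rather than something the abstract states.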