• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2006, Vol. 28 ›› Issue (12): 136-138.

An Improved Chinese Word Sense Disambiguation Method Based on Primitive (Sememe) Co-occurrence Frequency

刘亚清 于纯妍 张瑾   

  • Publication date: 2006-12-01 Release date: 2010-05-20

  • Online:2006-12-01 Published:2010-05-20

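Abstract (translated from the Chinese):

Traditional Chinese word sense disambiguation methods based on primitive (sememe) co-occurrence frequency suffer from "blindness". To address this, the paper draws on the concept definitions in HowNet (《知网》) and computes, for each sense of a polysemous word, separate correlation coefficients with the first independent primitive, the other independent primitives, the relation primitives, and the sign primitives of the feature words. The sense of the polysemous word is then decided by comparing the correlation coefficients between each of its senses and the feature words. Experiments show that the method further improves word sense disambiguation performance.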

Keywords (translated from the Chinese): primitive (sememe), correlation coefficient, word sense disambiguation

Abstract:

Traditional Chinese word sense disambiguation methods based on primitive co-occurrence data suffer from blindness. Based on the concept descriptions in HowNet, the authors calculate the correlation coefficients (relation-moduli) between each sense of a polysemous word and the first independent primitive, the other independent primitives, the relation primitives, and the sign primitives of the feature words. Finally, the sense of the polysemous word is chosen by comparing the correlation coefficients between each of its senses and the feature words. The experimental results show that the method further improves the accuracy of word sense disambiguation.

Key words: primitive, relation-modulus, word sense disambiguation
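
The selection step described in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: the abstract does not give the correlation formula, so the sketch assumes a precomputed table `pair_corr` of primitive-pair correlation scores, assumes that a sense is represented by the list of primitives in its HowNet definition, and assumes illustrative category weights in `WEIGHTS`. All names and numbers are hypothetical.

```python
# Hypothetical category weights; the abstract does not state how the four
# correlation coefficients are combined, so a weighted sum is assumed here.
WEIGHTS = {"first": 0.4, "others": 0.3, "relations": 0.2, "symbols": 0.1}


def category_score(sense_primitives, category_primitives, pair_corr):
    """Best correlation between any primitive of the sense and any primitive
    in one category of a feature word (0.0 if no pair is in the table)."""
    return max(
        (pair_corr.get(frozenset((a, b)), 0.0)
         for a in sense_primitives
         for b in category_primitives),
        default=0.0,
    )


def sense_score(sense_primitives, feature_words, pair_corr, weights=WEIGHTS):
    """Sum, over all feature words, of the weighted per-category scores."""
    total = 0.0
    for fw in feature_words:
        for category, weight in weights.items():
            total += weight * category_score(
                sense_primitives, fw.get(category, []), pair_corr
            )
    return total


def disambiguate(senses, feature_words, pair_corr, weights=WEIGHTS):
    """Return the sense label whose score against the feature words is highest."""
    return max(
        senses,
        key=lambda s: sense_score(senses[s], feature_words, pair_corr, weights),
    )


# Toy data (invented for illustration, not taken from HowNet or the paper).
pair_corr = {
    frozenset(("institution", "money")): 0.6,
    frozenset(("money", "save")): 0.8,
    frozenset(("waters", "save")): 0.1,
}
senses = {
    "bank/finance": ["institution", "money"],
    "bank/river": ["land", "waters"],
}
feature_words = [
    {"first": ["save"], "others": ["money"], "relations": [], "symbols": []},
]

chosen = disambiguate(senses, feature_words, pair_corr)
print(chosen)
```

Scoring each of the four primitive categories separately, rather than pooling all primitives of a feature word, is what lets the categories carry different weights; the actual weighting and correlation measure used in the paper may differ.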