• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (12): 2294-2299.

• Paper •

现代汉语通感的自动抽取及映射方向性

刘洪超, Francesca Strik Lievers, 黄居仁

  1. (Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong)
  • Received: 2015-08-07 Revised: 2015-10-19 Online: 2015-12-25 Published: 2015-12-25
  • Funding:

    Word Chinese and Their Grammatical Variations: Empirical Studies based on Comparable Corpora (GRF project 543512)

Automatic extraction and mapping directionality of synaesthetic sentences of modern Chinese

LIU Hongchao,Francesca Striklievers,HUANG Churen   

  1. (Department of Chinese and Bilingual Studies (CBS), The Hong Kong Polytechnic University, Hong Kong, China)
  • Received: 2015-08-07 Revised: 2015-10-19 Online: 2015-12-25 Published: 2015-12-25

Abstract (translated from the Chinese):

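This paper presents the automatic extraction of synaesthetic sentences in modern Chinese and the mapping patterns between sensory domains. Synaesthetic sentences are extracted from a corpus by building a word list for each sensory domain and by matching parts of speech. Two methods are used: the first matches sensory words from multiple domains only and reaches an accuracy of 20.78%; the second adds part-of-speech matching and reaches an accuracy of 46.37%. The main difficulties lie in selecting and collecting the words for the five sensory word lists and in summarizing the part-of-speech distribution rules. Finally, the source-domain to target-domain mappings of the extracted sentences are tallied, and their directionality is checked against that of other languages.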

Keywords: modern Chinese, synaesthesia, sensory words, automatic extraction
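
The two extraction methods described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the word lists are toy examples, the POS tags ("a" adjective, "n" noun, "u" particle) are illustrative, and the POS rule (a sensory adjective directly modifying a sensory noun from a different domain, optionally through 的) is an assumption, since the abstract does not state the actual rules.

```python
from typing import Dict, List, Optional, Set, Tuple

Token = Tuple[str, str]  # (word, POS tag); the tag set is illustrative

# Toy lexicons (hypothetical); the paper's real word lists are not reproduced here.
SENSE_WORDS: Dict[str, Set[str]] = {
    "touch": {"冷", "柔软"},
    "taste": {"甜", "苦"},
    "smell": {"香", "臭"},
    "hearing": {"声音", "响亮"},
    "vision": {"明亮", "颜色"},
}


def domain_of(word: str) -> Optional[str]:
    """Return the sensory domain whose list contains the word, if any."""
    for domain, words in SENSE_WORDS.items():
        if word in words:
            return domain
    return None


def matches_multi_domain(sentence: List[Token]) -> bool:
    """Method 1: the sentence contains words from two or more sense lists."""
    domains = {domain_of(word) for word, _ in sentence} - {None}
    return len(domains) >= 2


def matches_with_pos(sentence: List[Token]) -> bool:
    """Method 2 (assumed rule): a sensory adjective is immediately followed,
    ignoring the particle 的, by a sensory noun from a different domain."""
    content = [(w, p) for w, p in sentence if w != "的"]
    for (w1, p1), (w2, p2) in zip(content, content[1:]):
        d1, d2 = domain_of(w1), domain_of(w2)
        if p1 == "a" and p2 == "n" and d1 and d2 and d1 != d2:
            return True
    return False


# "sweet voice": a taste adjective modifies a hearing noun.
sweet_voice = [("甜", "a"), ("的", "u"), ("声音", "n")]
# "The apple is sweet, and its colour is nice too": two domains, no synaesthesia.
apple = [("苹果", "n"), ("很", "d"), ("甜", "a"), (",", "w"),
         ("颜色", "n"), ("也", "d"), ("好看", "a")]

for name, sent in [("sweet_voice", sweet_voice), ("apple", apple)]:
    print(name, matches_multi_domain(sent), matches_with_pos(sent))
```

The second sentence shows why a co-occurrence match alone yields many false positives: both methods accept the first sentence, but only the first method accepts the second. A single hand-written POS rule like this one would also miss predicative uses such as "her voice is sweet", so a fuller rule set would be needed in practice.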

Abstract:

This paper addresses the automatic extraction of synaesthetic sentences in modern Chinese and the mapping tendencies between sensory domains. Both extraction methods rely on lists of perception-related words, which we constructed for five senses: touch, taste, smell, hearing and vision. The first method extracts every sentence that contains perception-related words from two or more of these lists and reaches an accuracy of 20.78%; the second adds a check on part-of-speech (POS) distribution tendencies, which raises the accuracy to 46.37%. The main difficulties lie in collecting and selecting the perception-related words and in identifying the POS distribution rules of each word. Finally, we examine the directionality of the mappings from one sensory domain to another.

Key words: modern Chinese; synaesthesia; perception-related word; automatic extraction
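
The final step, checking mapping directionality, amounts to counting (source domain, target domain) pairs over the extracted sentences. The sketch below is one way to do that under stated assumptions: the sample pairs are invented for illustration, and the ordering passed to `upward_share` is a hypothetical reference ordering supplied by the caller, since the abstract does not say which ordering the authors compare against.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Mapping = Tuple[str, str]  # (source domain, target domain)


def tally_mappings(pairs: Iterable[Mapping]) -> Counter:
    """Count how often each source-to-target mapping occurs."""
    return Counter(pairs)


def upward_share(counts: Counter, order: List[str]) -> float:
    """Share of mappings whose source precedes the target in `order`.

    `order` is a caller-supplied reference ordering of the domains
    (an assumption here, not taken from the paper)."""
    rank = {domain: i for i, domain in enumerate(order)}
    total = sum(counts.values())
    upward = sum(n for (src, tgt), n in counts.items() if rank[src] < rank[tgt])
    return upward / total if total else 0.0


# Invented sample annotations, for illustration only.
sample_pairs = [
    ("taste", "hearing"),
    ("taste", "hearing"),
    ("touch", "vision"),
    ("hearing", "taste"),
]
# Hypothetical reference ordering of the five domains.
reference_order = ["touch", "taste", "smell", "hearing", "vision"]

counts = tally_mappings(sample_pairs)
share = upward_share(counts, reference_order)
print(counts.most_common())
print(f"share following the reference ordering: {share:.2f}")
```

A share close to 1 would indicate that the extracted mappings mostly follow the reference ordering, while a noticeably lower share would point to language-specific mapping directions.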