• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (12): 2352-2357.

• Papers •

A classification method for semi-supervised question classification with answers

ZHANG Dong, LI Shoushan, ZHOU Guodong

  1. (School of Computer Science & Technology, Soochow University, Suzhou 215006, China)
  • Received: 2015-08-15 Revised: 2015-10-24 Online: 2015-12-25 Published: 2015-12-25
  • Supported by:

    Key Program of the National Natural Science Foundation of China (61331011); National Natural Science Foundation of China (61375073, 61273320)

Abstract:

Question classification, the task of automatically assigning a type to each question, is a basic task in question answering research. We propose a semi-supervised question classification method that uses answers as auxiliary information. First, answer features are combined with question features to represent each sample. Then, a label propagation classifier is trained on the labeled questions and used to annotate the categories of the unlabeled questions automatically. Finally, the initially labeled questions and the automatically labeled questions are merged into a single training set, and a maximum entropy model trained on it classifies the test questions. Experimental results show that the proposed method makes effective use of unlabeled samples to improve performance and clearly outperforms the other baseline methods.

Key words: question answering system; question classification; answer aiding; semi-supervised classification; label propagation
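
The three steps in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the abstract does not specify the features, the graph construction, or the optimizer, so the bag-of-words features with `q=`/`a=` prefixes, the cosine-similarity graph, the clamped iterative propagation, the stochastic gradient fit of the maximum entropy model, the class names `LOC` and `PER`, and the toy data are all assumptions made for illustration. The sketch also assumes that test questions come with answers.

```python
import math
from collections import Counter

def featurize(question, answer):
    """Bag of words over question and answer tokens, kept in separate namespaces."""
    feats = Counter("q=" + w for w in question.lower().split())
    feats.update("a=" + w for w in answer.lower().split())
    return feats

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def propagate_labels(feats, labels, classes, iters=20):
    """Iterative label propagation; labels[i] is a class name, or None if unlabeled."""
    n = len(feats)
    sim = [[cosine(feats[i], feats[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    # Labeled nodes start one-hot and stay clamped; unlabeled nodes start uniform.
    dist = [[1.0 if labels[i] == c else 0.0 for c in classes] if labels[i] is not None
            else [1.0 / len(classes)] * len(classes) for i in range(n)]
    for _ in range(iters):
        new_dist = []
        for i in range(n):
            total = sum(sim[i])
            if labels[i] is not None or total == 0.0:
                new_dist.append(dist[i])  # keep labeled and isolated nodes unchanged
                continue
            # Similarity-weighted average of the neighbours' label distributions.
            new_dist.append([sum(sim[i][j] * dist[j][k] for j in range(n)) / total
                             for k in range(len(classes))])
        dist = new_dist
    return [classes[max(range(len(classes)), key=lambda k: dist[i][k])] for i in range(n)]

def class_probs(weights, x, classes):
    """Softmax over per-class linear scores."""
    scores = [sum(weights[c][f] * v for f, v in x.items()) for c in classes]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    return [e / sum(exps) for e in exps]

def train_maxent(feats, labels, classes, epochs=100, lr=0.1):
    """Maximum entropy (multinomial logistic regression) fitted by stochastic gradient ascent."""
    weights = {c: Counter() for c in classes}
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            probs = class_probs(weights, x, classes)
            for c, p in zip(classes, probs):
                step = lr * ((1.0 if c == y else 0.0) - p)
                for f, v in x.items():
                    weights[c][f] += step * v
    return weights

def classify(weights, x, classes):
    probs = class_probs(weights, x, classes)
    return classes[probs.index(max(probs))]

# Toy data: hypothetical question types and question/answer pairs.
CLASSES = ["LOC", "PER"]
labeled = [("where is the eiffel tower", "it is located in paris france", "LOC"),
           ("who wrote hamlet", "the author was william shakespeare", "PER")]
unlabeled = [("where is big ben", "it is located in london"),
             ("who painted the mona lisa", "the painter was leonardo da vinci")]

# Step 1: represent every sample with question and answer features.
feats = [featurize(q, a) for q, a, _ in labeled] + [featurize(q, a) for q, a in unlabeled]
seed_labels = [y for _, _, y in labeled] + [None] * len(unlabeled)
# Step 2: label the unlabeled questions by propagation from the labeled ones.
pseudo_labels = propagate_labels(feats, seed_labels, CLASSES)
# Step 3: train the maximum entropy model on the merged set and classify a test question.
model = train_maxent(feats, pseudo_labels, CLASSES)
prediction = classify(model, featurize("where is the colosseum", "it is located in rome"), CLASSES)
print(pseudo_labels, prediction)
```

Prefixing tokens with `q=` and `a=` keeps question words and answer words as distinct features, which is one simple way to let the answer text contribute evidence without being confused with the question text.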