• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (1): 143-149. doi: 10.3969/j.issn.1007-130X.2011.

• Articles •

A Study of the Question Classification Task in Community-Based Q&A Services

WANG Junze,HUANG Benxiong,HU Guang,WEN Jie   

  1. (Department of Electronics and Information Engineering,
    Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received:2009-12-21 Revised:2010-04-17 Online:2011-01-25 Published:2011-01-25

Abstract:

In Communitybased Q&A services(referred to as cQA) such as Baidu Zhidao, question classification is one of the crucial tasks and it is important to organize the questions submitted to the cQA system. The question categorization algorithm for the cQA service needs to get high accuracy, low computation and lowsensitivity to noise. Based on the kullbackLeibler distance classification algorithm, this paper introduces a new question classification approach adopting the idea of language model, named ngram KLD. The experimental results with a large corpus which contains more than 1 million questionanswer pairs show a significant improvement when the ngram KLD algorithm is used. And the ngram KLD algorithm is fit for the actual demand of the question classification task in the cQA service.

Key words: short text classification; Kullback-Leibler distance; language model
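
The abstract does not give the details of the n-gram KLD algorithm, so the following is only a minimal sketch of the general idea it names: build one n-gram language model per category and assign a question to the category whose model is closest in Kullback-Leibler distance. The use of character n-grams, the add-alpha smoothing, and all function names (ngrams, train, kl_distance, classify) are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Return the character n-grams of a string.

    Character n-grams are an assumption here; they need no word segmentation.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(labeled_questions, n=2):
    """Count n-grams per category. Returns (counts_by_category, vocabulary)."""
    counts = {}
    vocab = set()
    for text, label in labeled_questions:
        grams = ngrams(text, n)
        counts.setdefault(label, Counter()).update(grams)
        vocab.update(grams)
    return counts, vocab


def kl_distance(question, cat_counts, vocab_size, n=2, alpha=1.0):
    """Compute KL(P_question || P_category) over the question's n-grams.

    The category model uses add-alpha smoothing (an assumption), with one
    extra vocabulary slot reserved for n-grams never seen in training.
    """
    q_counts = Counter(ngrams(question, n))
    q_total = sum(q_counts.values())
    if q_total == 0:
        return float("inf")  # question shorter than n characters
    c_total = sum(cat_counts.values()) + alpha * (vocab_size + 1)
    dist = 0.0
    for gram, k in q_counts.items():
        p = k / q_total
        q = (cat_counts.get(gram, 0) + alpha) / c_total
        dist += p * math.log(p / q)
    return dist


def classify(question, counts, vocab, n=2):
    """Pick the category whose language model has the smallest KL distance."""
    return min(
        counts,
        key=lambda label: kl_distance(question, counts[label], len(vocab), n),
    )


# Toy training data, invented for illustration only.
toy_data = [
    ("how to install python on windows", "computers"),
    ("python install error on linux", "computers"),
    ("best recipe for chicken soup", "food"),
    ("how to cook rice soup", "food"),
]
toy_counts, toy_vocab = train(toy_data)
predicted = classify("install python linux", toy_counts, toy_vocab)
```

Only n-grams that occur in the question contribute to the sum, so the cost per category is proportional to the question length, which fits the low-computation requirement stated in the abstract. Smoothing keeps unseen n-grams from producing an infinite distance, which is one simple way to limit sensitivity to noisy input.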