• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (9): 188-192.

• Articles •

Research on Feature Selection Metric for Predicate Identification

ZHANG Yihao,JIN Peng   

  1. (1. School of Computer Science, Leshan Teachers’ College, Leshan 614004; 2. Laboratory of Intelligent Information Processing and Application Institutional, Leshan Teachers’ College, Leshan 614004, China)
  • Received: 2012-04-12 Revised: 2012-06-16 Online: 2012-09-25 Published: 2012-09-25

Abstract:

Predicate identification is one of the important research topics in shallow parsing. This paper proposes a predicate identification method based on the support vector machine (SVM) classification algorithm. The focus is on feature selection with information gain and on measuring feature words with TongYiCiCiLin. The information gain method selects the features that have a greater impact on the classification model, which reduces the dimensionality of the feature vector. TongYiCiCiLin maps the feature words onto deep-seated semantic concepts, enhances the representational ability of the features, and emphasizes the degree of correlation between the features and the model. Experiments on a relatively small corpus show that the best F-score for predicate identification reaches 84.0%, an increase of 4.6% over the result obtained without this processing of the data. The experimental results show that the new feature word selection method and the feature attribute representation are effective for predicate identification and can greatly improve classification performance.

Key words: predicate identification; feature selection; TongYiCiCiLin; information gain; support vector machine
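
The information gain criterion named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_top_k`) and the toy feature columns are hypothetical, the features are assumed to be discrete, and neither the TongYiCiCiLin mapping nor the SVM classifier is shown. It only demonstrates how ranking features by information gain keeps the informative ones and shrinks the feature vector.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum_v P(X = v) * H(Y | X = v) for one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(features, labels, k):
    """Rank named feature columns by information gain and keep the k best."""
    scored = {name: information_gain(col, labels)
              for name, col in features.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Hypothetical toy data: 1 = predicate, 0 = not a predicate.
toy_labels = [1, 1, 0, 0]
toy_features = {
    "pos_tag": ["v", "v", "n", "n"],   # perfectly separates the two classes
    "noise": ["a", "b", "a", "b"],     # carries no class information
}
selected = select_top_k(toy_features, toy_labels, k=1)
```

In this toy setting the `pos_tag` column is kept and the `noise` column is dropped, which mirrors the dimensionality reduction the abstract describes; the selected columns would then be passed to whatever classifier is used downstream.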