• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (02): 340-346.

• Articles •

Text categorization based on resource allocating network and semantic feature selection        

HE Xiaoliang1,2,SONG Wei1,LIANG Jiuzhen1   

  1. School of IoT Engineering, Jiangnan University, Wuxi 214122;
  2. Traffic Management Research Institute, Ministry of Public Security, Wuxi 214151, China
  • Received: 2012-08-13 Revised: 2012-10-08 Online: 2014-02-25 Published: 2014-02-25

Abstract:

The Resource Allocating Network (RAN) learning algorithm has two weaknesses: its hidden nodes are sensitive to the initial training data, and it converges slowly. A new RAN learning algorithm is proposed to address both. The initial hidden-layer nodes are determined with the K-means algorithm, and an "RMS window" is added to the novelty rule so that the decision whether to add a hidden-layer node is more reliable. The network parameters are then adjusted by combining the Least Mean Squares (LMS) algorithm with the Extended Kalman Filter (EKF) algorithm, which speeds up learning. Because word-space text categorization methods struggle with the high dimensionality and complex semantics of text, a semantic feature selection method is used to reduce the dimensionality and to map texts into a semantic feature space that serves as the network input. Experimental results show that the new RAN algorithm learns quickly, yields a compact network structure, and classifies well. Moreover, semantic feature selection not only reduces the dimensionality and the categorization time but also effectively improves the accuracy of the categorization system.

Key words: RAN learning algorithm; radial basis function; semantic feature selection; extended Kalman filter algorithm; least mean squares algorithm; text categorization
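
The abstract names an RMS window added to the novelty rule but does not give its formula. The sketch below illustrates one common reading: a hidden node is added only when the input is far from every existing center, the current error is large, and the root-mean-square error over a sliding window of recent samples is also large. The class name `GrowthCriterion`, the three thresholds, and the window size are hypothetical and chosen for illustration; the paper's actual rule may differ.

```python
import math
from collections import deque


class GrowthCriterion:
    """Hypothetical novelty check with an RMS error window.

    Assumes at least one center already exists (for example, centers
    produced by a K-means initialization step).
    """

    def __init__(self, dist_threshold, err_threshold, rms_threshold, window_size):
        self.dist_threshold = dist_threshold
        self.err_threshold = err_threshold
        self.rms_threshold = rms_threshold
        # Sliding window holding the most recent output errors.
        self.errors = deque(maxlen=window_size)

    def should_add_node(self, x, error, centers):
        """Return True if a new hidden node should be allocated for input x."""
        self.errors.append(error)
        # Distance from x to the nearest existing hidden-node center.
        nearest = min(math.dist(x, c) for c in centers)
        # RMS of the errors currently held in the window.
        rms = math.sqrt(sum(e * e for e in self.errors) / len(self.errors))
        return (
            nearest > self.dist_threshold
            and abs(error) > self.err_threshold
            and rms > self.rms_threshold
        )


if __name__ == "__main__":
    criterion = GrowthCriterion(
        dist_threshold=0.5, err_threshold=0.1, rms_threshold=0.1, window_size=3
    )
    centers = [(0.0, 0.0)]
    print(criterion.should_add_node((2.0, 2.0), 0.8, centers))
```

Under these assumptions, the window keeps a single error spike that follows a run of small errors from triggering growth, which is one way such a test can keep the network compact. When no node is added, the existing parameters would be updated instead (the abstract names LMS and EKF for that step); that update is not shown here.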