• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


An MLFM-MN short text classification
algorithm based on TNG feature extension

WEN Wu 1,2,3, LI Pei-qiang 1,2, GUO You-qing 1,2

  1. School of Communication and Information Engineering,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
  2. Research Center of New Communication Technology Applications,
    Chongqing University of Posts and Telecommunications, Chongqing 400065;
  3. Chongqing Xinke Design Co. Ltd., Chongqing 401121, China
  • Received: 2018-12-11 Revised: 2019-04-25 Online: 2019-11-25 Published: 2019-11-25

Abstract:

Short texts have sparse features and high-dimensional representations, so traditional text classification methods fail to achieve the desired classification rate and accuracy. To address this problem, we propose a multi-level fuzzy min-max neural network (MLFM-MN) short text classification algorithm based on topic N-Gram (TNG) feature extension. The algorithm first constructs a feature extension library using an improved TNG model, which infers not only the word distribution but also the phrase distribution of each topic. It then computes the topic tendencies of a short text from its original features, selects appropriate candidate words and phrases from the feature extension library according to those tendencies, and adds them to the original text. Finally, the extended texts are classified by the MLFM-MN algorithm. We evaluate classification performance using accuracy, recall, and F1 score. The results show that the proposed algorithm can significantly improve text classification performance.
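
The feature extension step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the extension library contents, the scoring rule (summed term probabilities, normalised across topics), and the names `EXTENSION_LIBRARY`, `topic_tendency`, `extend_text`, `threshold`, and `top_k` are all assumptions made for illustration. In the paper the library would be produced by the improved TNG model.

```python
# Hypothetical extension library: topic -> {word or phrase: probability}.
# The topics and probabilities below are invented for illustration only.
EXTENSION_LIBRARY = {
    "sports": {"match": 0.30, "team": 0.25, "world cup": 0.20, "goal": 0.15},
    "finance": {"stock": 0.35, "market": 0.30, "interest rate": 0.20, "goal": 0.05},
}


def topic_tendency(tokens, library):
    """Score each topic by the summed probability of the text's tokens,
    then normalise so the scores sum to 1 (all zeros if nothing matches)."""
    scores = {
        topic: sum(dist.get(tok, 0.0) for tok in tokens)
        for topic, dist in library.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {topic: 0.0 for topic in scores}
    return {topic: score / total for topic, score in scores.items()}


def extend_text(tokens, library, threshold=0.5, top_k=2):
    """Append up to top_k of the most probable unseen terms from every topic
    whose tendency is at least `threshold` (an assumed selection rule)."""
    tendency = topic_tendency(tokens, library)
    extended = list(tokens)
    for topic, weight in tendency.items():
        if weight < threshold:
            continue
        candidates = sorted(
            library[topic].items(), key=lambda item: item[1], reverse=True
        )
        added = 0
        for term, _prob in candidates:
            if term in extended:
                continue
            extended.append(term)
            added += 1
            if added == top_k:
                break
    return extended


if __name__ == "__main__":
    print(extend_text(["team", "goal"], EXTENSION_LIBRARY))
```

With the toy library above, the short text `["team", "goal"]` leans towards the "sports" topic, so its two most probable unseen terms are appended. The extended token list would then be vectorised and passed to the classifier.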
 

Key words: sparse feature, TNG model, fuzzy neural network, extension library, topic tendency