• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (5): 1046-1051.

• Paper •

Recognition of geographical names in Mongolian
based on conditional random fields and dictionary   

WU Jinxing1,LI Li1,YANG Zhenxin2   

  (1. School of Mongolian Studies, Inner Mongolia University, Huhhot 010021;
   2. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China)
  • Received: 2015-10-29  Revised: 2015-12-10  Online: 2016-05-25  Published: 2016-05-25

Abstract:

This paper presents the first implementation of Mongolian geographical name recognition based on conditional random fields (CRF). We first analyze the forms and characteristics of geographical names in the corpus in light of the agglutinative nature of Mongolian. In addition to designation words and part of speech, lexical features are introduced as features for locating geographical names. Names that the CRF model fails to recognize are then recovered by lookup in place-name dictionaries. Trained on a third-level annotated corpus of about 1,000,000 words, the proposed model achieves a precision of 94.68%, a recall of 84.40%, and an F score of 89.24%.
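
The two-stage idea in the abstract (a statistical tagger followed by a dictionary pass) and the reported scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the simple LOC/O tag set, and the exact-match lookup are assumptions made for illustration only, and the abstract does not say how the dictionary lookup is actually performed. The F score function uses the standard balanced definition F = 2PR / (P + R), treating the first reported figure (94.68%) as precision.

```python
def f_score(precision, recall):
    """Balanced F score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def dictionary_fallback(tokens, tags, gazetteer):
    """Relabel tokens that a first-pass tagger missed.

    tokens:    list of word strings
    tags:      list of "LOC" / "O" labels from a first-pass tagger
               (for example a CRF); same length as tokens
    gazetteer: set of known place-name strings (a hypothetical
               stand-in for the location dictionaries)

    A token is relabelled "LOC" only if the tagger left it as "O"
    and it matches a dictionary entry exactly. Existing "LOC"
    labels are never changed.
    """
    return [
        "LOC" if tag == "O" and tok in gazetteer else tag
        for tok, tag in zip(tokens, tags)
    ]


if __name__ == "__main__":
    # Figures reported in the abstract, expressed as fractions.
    print(f_score(0.9468, 0.8440))

    # Toy placeholder tokens, not real corpus data.
    toks = ["w1", "w2", "w3"]
    first_pass = ["O", "LOC", "O"]
    print(dictionary_fallback(toks, first_pass, {"w1"}))
```

With the reported precision and recall, the formula gives roughly 0.8924, which agrees with the stated F score of 89.24%. A dictionary pass of this kind can only add names, so its intended effect is to raise recall.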

Key words: Mongolian geographical name; recognition; CRF; feature; dictionary