• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (03): 518-524.

• Artificial Intelligence and Data Mining •

Chinese-Urdu neural machine translation interacting POS sequence prediction in Urdu language

CHEN Huan-huan 1,2, WANG Jian 1,2, Muhammad Naeem Ul Hassan 1,2

  (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500;
   2. Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming University of Science and Technology, Kunming 650500, China)
  • Received: 2023-01-11  Revised: 2023-03-22  Accepted: 2024-03-25  Online: 2024-03-25  Published: 2024-03-18

Abstract: Many research teams have studied machine translation for the minority languages of South and Southeast Asia in depth. However, Urdu, the official language of Pakistan, has limited data resources and differs substantially from Chinese, so Chinese-Urdu machine translation has received little targeted research. To address this issue, this paper proposes a Transformer-based Chinese-Urdu neural machine translation model that incorporates Urdu part-of-speech (POS) sequence prediction. First, a Transformer predicts the POS sequence of the target language, Urdu. Then the translation model's predictions are combined with those of the POS sequence prediction model to jointly predict the final translation, thereby integrating linguistic knowledge into the translation model. Experiments on a small-scale Chinese-Urdu dataset show that the proposed method raises the BLEU score by 0.13 over the baseline model, a significant improvement.
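
The abstract does not state how the two models' predictions are combined, so the following is only a minimal sketch of one plausible scheme: a log-linear mix of the translation model's token probabilities with the POS model's tag probabilities at a single decoding step. The function name `combine_with_pos`, the `weight` parameter, the token-to-tag lookup, and the romanized Urdu toy tokens are all assumptions made for illustration and are not taken from the paper.

```python
import math


def combine_with_pos(token_probs, pos_probs, token_to_pos, weight=0.5):
    """Rescore candidate tokens with a POS prior (assumed log-linear mix).

    token_probs:  translation model's probability for each candidate token
    pos_probs:    POS model's probability for each tag at this position
    token_to_pos: hypothetical lookup from token to its POS tag
    weight:       how strongly the POS prediction influences the result
    """
    scores = {}
    for token, p_tok in token_probs.items():
        tag = token_to_pos[token]
        p_tag = pos_probs.get(tag, 1e-9)  # small floor avoids log(0)
        scores[token] = math.log(p_tok) + weight * math.log(p_tag)

    # Renormalise the scores into a distribution (stable softmax).
    max_score = max(scores.values())
    exp_scores = {t: math.exp(s - max_score) for t, s in scores.items()}
    total = sum(exp_scores.values())
    return {t: v / total for t, v in exp_scores.items()}


# Toy decoding step with romanized Urdu candidates (illustrative values only).
token_probs = {"kitab": 0.40, "parhna": 0.45, "acha": 0.15}
pos_probs = {"NOUN": 0.7, "VERB": 0.2, "ADJ": 0.1}
token_to_pos = {"kitab": "NOUN", "parhna": "VERB", "acha": "ADJ"}

combined = combine_with_pos(token_probs, pos_probs, token_to_pos)
best = max(combined, key=combined.get)
print(best)
```

In this toy step the translation model alone slightly prefers the verb, but the POS model's strong expectation of a noun shifts the joint choice to the noun candidate, which is the kind of interaction the abstract describes.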

Key words: Transformer, neural machine translation, Urdu, part-of-speech sequence