
Computer Engineering & Science, 2022, Vol. 44, Issue (04): 730-736.

• Artificial Intelligence and Data Mining •

Unsupervised neural machine translation model based on pre-training

XUE Qing-tian, LI Jun-hui, GONG Zheng-xian, XU Dong-qin

  1. (Natural Language Processing Laboratory, Soochow University, Suzhou 215006, China)
  • Received: 2020-08-26  Revised: 2020-11-30  Accepted: 2022-04-25  Online: 2022-04-25  Published: 2022-04-20

Abstract: Relying on large-scale parallel corpora, neural machine translation has achieved great success on some language pairs. Unsupervised neural machine translation (UNMT) has since partly addressed the difficulty of obtaining high-quality parallel corpora. Recent studies show that cross-lingual language model pre-training can significantly improve the translation performance of UNMT. This approach uses large-scale monolingual corpora to model deep contextual information in cross-lingual settings and achieves significant results. This paper further explores UNMT based on cross-lingual language model pre-training, proposes several improvements to model training, and compares the performance of UNMT with baseline systems on different language pairs. To address the unbalanced initialization of UNMT parameters when using pre-trained models, this paper proposes a secondary pre-training stage that continues pre-training, and proposes initializing the cross-attention sub-layers of the UNMT model with the pre-trained self-attention sub-layers. In addition, since back-translation plays a critical role in UNMT, a Teacher-Student framework is proposed to guide back-translation. Experimental results show that, compared with the baseline system, these methods improve BLEU scores by up to 0.8~2.08 points.
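As a rough illustration of the cross-attention initialization described in the abstract, the sketch below copies each decoder layer's pre-trained self-attention parameters into its cross-attention sub-layer before UNMT training begins. This is a minimal sketch under assumptions, not the authors' code: the PyTorch module layout and the attribute names `self_attn` and `cross_attn` are hypothetical, and real toolkits may name or structure these modules differently (e.g. fairseq uses `encoder_attn` for cross-attention).

```python
# Minimal sketch (assumed module layout, not the paper's implementation):
# initialize a Transformer decoder's cross-attention sub-layers from its
# pre-trained self-attention sub-layers, assuming both share the same
# multi-head attention parameter shapes.
import torch
import torch.nn as nn


def init_cross_attn_from_self_attn(decoder_layers: nn.ModuleList) -> None:
    """Copy each layer's self-attention parameters into its cross-attention
    sub-layer so that both start from the same pre-trained weights."""
    with torch.no_grad():
        for layer in decoder_layers:
            # `self_attn` and `cross_attn` are hypothetical attribute names.
            for (_, src_param), (_, dst_param) in zip(
                layer.self_attn.named_parameters(),
                layer.cross_attn.named_parameters(),
            ):
                dst_param.copy_(src_param)
```

Under this assumption, the function would be called once on the decoder of the pre-trained model, after loading the cross-lingual language model weights and before starting UNMT training with denoising and back-translation.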



Key words: neural network, neural machine translation, unsupervised, pre-training