Unsupervised neural machine translation model based on pre-training

Computer Engineering & Science, 2022, Vol. 44, Issue (04): 730-736.

XUE Qing-tian, LI Jun-hui, GONG Zheng-xian, XU Dong-qin

(Natural Language Processing Laboratory, Soochow University, Suzhou 215006, China)

Received: 2020-08-26 Revised: 2020-11-30 Accepted: 2022-04-25 Online: 2022-04-25 Published: 2022-04-20
Supported by:
National Natural Science Foundation of China (61876120)

Abstract

Abstract: Relying on large-scale parallel corpora, neural machine translation has achieved great success on some language pairs. Unsupervised neural machine translation (UNMT) in turn partly alleviates the problem that high-quality parallel corpora are difficult to obtain. Recent studies show that cross-lingual language model pre-training can significantly improve the translation performance of UNMT: it uses large-scale monolingual corpora to model deep contextual information in cross-lingual scenarios. This paper further explores UNMT based on cross-lingual pre-training, proposes several methods to improve model training, and compares them with a baseline system on different language pairs. To address the unbalanced quality of UNMT parameter initialization after pre-training, the paper proposes two methods: a secondary pre-training stage that continues pre-training the language model, and initializing the cross-attention (context attention) sub-layer of the UNMT model with the self-attention sub-layer of the pre-trained model. Meanwhile, because back-translation plays a critical role in UNMT but lacks guidance, the paper incorporates a Teacher-Student framework into UNMT to guide back-translation. Experimental results show that, compared with the baseline system, the proposed methods improve the BLEU (bilingual evaluation understudy) score by at most 0.8 to 2.08 percentage points on different language pairs.

Key words: neural network, neural machine translation, unsupervised, pre-training
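The abstract's second initialization method, reusing self-attention parameters for the cross-attention sub-layer, can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter-name convention (`.self_attn.`, `.cross_attn.`), the function `init_cross_attention`, and the toy weights are all hypothetical, and plain Python lists stand in for real tensors. The sketch assumes the pre-trained language model has no cross-attention parameters of its own, so the decoder's cross-attention would otherwise start from random values.

```python
import copy

# Hypothetical parameter-name convention, assumed for illustration only.
SELF_ATTN = ".self_attn."
CROSS_ATTN = ".cross_attn."


def init_cross_attention(pretrained):
    """Return decoder parameters built from a pre-trained parameter dict.

    Every pre-trained parameter is copied as-is; in addition, each
    self-attention parameter is copied under the matching cross-attention
    name, so the cross-attention sub-layer does not start from random values.
    """
    decoder = copy.deepcopy(pretrained)
    for name, weights in pretrained.items():
        if SELF_ATTN in name:
            cross_name = name.replace(SELF_ATTN, CROSS_ATTN)
            # Deep copy so later updates to cross-attention do not alias
            # the self-attention weights.
            decoder[cross_name] = copy.deepcopy(weights)
    return decoder


# Toy pre-trained parameters (made-up values, one layer).
pretrained = {
    "layers.0.self_attn.q_proj": [[0.1, 0.2], [0.3, 0.4]],
    "layers.0.self_attn.k_proj": [[0.5, 0.6], [0.7, 0.8]],
    "layers.0.ffn.w1": [[1.0, 0.0], [0.0, 1.0]],
}
decoder = init_cross_attention(pretrained)
```

In a real framework the same idea would amount to copying the corresponding entries of the model's parameter dictionary before UNMT training starts; whether the copied weights are later tied or trained separately is a design choice the abstract does not specify.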

XUE Qing-tian, LI Jun-hui, GONG Zheng-xian, XU Dong-qin. Unsupervised neural machine translation model based on pre-training[J]. Computer Engineering & Science, 2022, 44(04): 730-736.
