Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (03): 546-553.

• Artificial Intelligence and Data Mining •

A Chinese-Vietnamese neural machine translation method using the dual representation of BERT and word embedding

ZHANG Ying-chen1,2, GAO Sheng-xiang1,2, YU Zheng-tao1,2, WANG Zhen-han1,2, MAO Cun-li1,2

  (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China;
   2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China)
  • Received: 2021-03-28  Revised: 2021-07-22  Accepted: 2023-03-25  Online: 2023-03-25  Published: 2023-03-23
  • Supported by:
    National Natural Science Foundation of China (61972186, U21B2027, 61732005, 61761026, 61672271, 61762056); National Key Research and Development Program of China (2019QY1802, 2019QY1801, 2019QY1800); Yunnan High-Tech Talent Project (201606); Major Science and Technology Special Project of Yunnan Province (202103AA080015, 202002AD080001); Basic Research Program of Yunnan Province (202001AS070014, 2018FB104); Provincial Personnel Training Project of Kunming University of Science and Technology (KKSY201703005)

Abstract: Neural machine translation is the current mainstream machine translation method. However, in low-resource translation tasks such as Chinese-Vietnamese, its performance is far from ideal because the bilingual parallel corpus is small. Considering that pre-trained language models contain rich linguistic information, incorporating their representations into a neural machine translation system may benefit low-resource machine translation. Therefore, this paper proposes a low-resource neural machine translation method that combines the dual representation of the BERT pre-trained language model and word embedding. First, the pre-trained language model and the word embedding are used separately to learn representations of the source language sequence; an attention mechanism establishes the connection between the two representations, and a concatenation operation yields the dual representation vector. Then, through a linear transformation and a self-attention mechanism, the word embedding representation and the pre-trained language model representation are fully and adaptively fused, producing a sufficient representation of the input text and thereby improving the performance of the neural machine translation model. Translation experiments on the Chinese-Vietnamese language pair show that, compared with the baseline system, the proposed method improves BLEU by 1.99 on Chinese-Vietnamese training data of 127,000 parallel sentence pairs and by 4.34 on training data of 70,000 parallel sentence pairs, demonstrating that fusing the dual representation of the BERT pre-trained language model and word embedding can effectively improve the performance of Chinese-Vietnamese machine translation.
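
The fusion step described in the abstract can be illustrated with a short sketch. The following PyTorch code is a minimal, hypothetical reading of that description (module names, dimensions, and the choice of multi-head attention are assumptions, not the authors' released implementation): the word-embedding encoding attends to the BERT encoding, the two are concatenated into a dual representation vector, and a linear transformation followed by self-attention fuses them adaptively.

import torch
import torch.nn as nn

class DualRepresentationFusion(nn.Module):
    """Illustrative fusion of a BERT representation and a word-embedding
    representation of the same source sentence (assumed shapes only)."""
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # Attention that relates the two representations (word embedding as query).
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Linear transformation applied to the concatenated dual representation.
        self.proj = nn.Linear(2 * d_model, d_model)
        # Self-attention for the fully adaptive fusion described in the abstract.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, h_emb: torch.Tensor, h_bert: torch.Tensor) -> torch.Tensor:
        # h_emb, h_bert: (batch, src_len, d_model)
        attended, _ = self.cross_attn(query=h_emb, key=h_bert, value=h_bert)
        dual = torch.cat([h_emb, attended], dim=-1)   # (batch, src_len, 2*d_model)
        fused = self.proj(dual)                       # (batch, src_len, d_model)
        fused, _ = self.self_attn(fused, fused, fused)
        return fused

# Usage with random tensors standing in for the two encoder outputs:
fusion = DualRepresentationFusion()
h_emb = torch.randn(2, 10, 512)     # word-embedding encoding of the source
h_bert = torch.randn(2, 10, 512)    # BERT encoding, assumed projected to d_model
print(fusion(h_emb, h_bert).shape)  # torch.Size([2, 10, 512])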

Key words: neural machine translation, pre-trained language model, word embedding, Chinese-Vietnamese