• A publication of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (01): 134-141.

• Artificial Intelligence and Data Mining •

  • Funding: National Natural Science Foundation of China (61673289)

Enhancing information transfer in neural machine translation

SHI Xiao-jing, NING Qiu-yi, JI Bai-jun, DUAN Xiang-yu

  1. (Natural Language Processing Laboratory, Soochow University, Suzhou 215006, China)
  • Received: 2020-03-17  Revised: 2020-05-08  Accepted: 2021-01-25  Online: 2021-01-25  Published: 2021-01-22


Abstract: In the field of Neural Machine Translation (NMT), multi-layer neural network structures can significantly improve translation performance, but they suffer from an inherent degradation of information transfer across layers. To alleviate this problem, this paper proposes an information transfer enhancement method that fuses information across layers and sublayers, strengthening the flow of information between the layers of a deep network. A "retention gate" mechanism controls the transfer weight of the fused information, which is then concatenated with the output of the current layer to serve as the input of the next layer, making information transfer between layers fuller. Experiments were carried out on the state-of-the-art NMT model Transformer. Results on the Chinese-English and German-English translation tasks show that the proposed method improves the BLEU score by 0.66 and 0.42 respectively over the baseline system.
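The abstract only sketches the mechanism. A minimal NumPy illustration of one plausible reading follows, assuming the fusion function is an average over the outputs of the preceding layers and the retention gate is a linear map followed by a sigmoid; the names `retention_gate_fusion`, `W_g`, and `b_g` are hypothetical, not taken from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def retention_gate_fusion(prev_outputs, current_out, W_g, b_g):
    """Sketch of gated inter-layer information fusion.

    prev_outputs: list of (batch, d) arrays, outputs of preceding layers.
    current_out:  (batch, d) array, output of the current layer.
    W_g, b_g:     hypothetical gate parameters (learned in a real model).
    """
    # Fuse the outputs of all preceding layers (simple average here;
    # the paper's exact fusion function is not given in the abstract).
    fused = np.mean(prev_outputs, axis=0)
    # "Retention gate": a sigmoid gate in (0, 1) that controls the
    # transfer weight of the fused information.
    gate = sigmoid(fused @ W_g + b_g)
    retained = gate * fused
    # Concatenate the gated fusion with the current layer's output to
    # form the input of the next layer, as described in the abstract.
    return np.concatenate([retained, current_out], axis=-1)
```

The key design point is that the gate modulates only the fused history, while the current layer's output passes through unchanged, so the next layer always sees an undistorted copy of its direct predecessor alongside the weighted summary of earlier layers.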

Key words: neural network, neural machine translation, information transfer, information degradation, residual network, gate mechanism