
Computer Engineering & Science


English-Chinese translation based on
an improved seq2seq model

XIAO Xinfeng1,2,LI Shijun2,YU Wei2,LIU Jie2,LIU Beixiong1   

  (1. Department of Mechanical and Electrical Engineering,
    Guangdong Polytechnic of Environmental Protection Engineering, Foshan 528216, China;
   2. School of Computer Science, Wuhan University, Wuhan 430079, China)
  • Received: 2018-10-13  Revised: 2018-12-10  Online: 2019-07-25  Published: 2019-07-25

Abstract:

Current machine translation systems optimize and evaluate translation mainly for Indo-European languages, and research on the Chinese language remains scarce. At present the seq2seq model, a neural machine translation model based on the attention mechanism, is the leading method in machine translation; however, it does not take into account the grammatical transformation between different languages. We propose an optimized English-Chinese translation model that preprocesses texts and initializes the embedding-layer parameters with dedicated methods, and that improves the seq2seq structure by adding a transform layer between the encoder and the decoder to handle grammatical transformation. Preprocessing reduces the parameter size and training time of the translation model by 20% and raises translation performance by 0.4 BLEU, and the transform layer improves the seq2seq model's translation performance by a further 0.7 to 1.0 BLEU. Experiments show that, compared with the existing mainstream attention-based seq2seq model, the proposed model requires the same training time on English-Chinese corpora of different sizes while improving translation performance by 1 to 2 BLEU.
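
The abstract does not specify the internal design of the transform layer. As a purely illustrative sketch, the PyTorch code below shows where such a layer could sit in an attention-based seq2seq model; the GRU encoder/decoder, the feed-forward form of the transform layer, and all names and dimensions are assumptions made for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class Seq2SeqWithTransform(nn.Module):
    """Illustrative seq2seq model with an extra transform layer between
    the encoder and the decoder, as the abstract describes. The layer's
    exact design is not given in the abstract; a feed-forward block over
    the encoder states is assumed here."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Hypothetical transform layer: remaps encoder representations so
        # the decoder sees states adapted to target-side grammar.
        self.transform = nn.Sequential(nn.Linear(hid_dim, hid_dim), nn.Tanh())
        self.decoder = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim * 2, 1)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        enc_states, enc_hidden = self.encoder(self.src_emb(src))
        enc_states = self.transform(enc_states)  # transform layer on encoder outputs
        hidden = self.transform(enc_hidden)      # and on the decoder's initial state
        tgt_embedded = self.tgt_emb(tgt)
        logits = []
        for t in range(tgt_embedded.size(1)):    # teacher-forced decoding
            # Additive-style attention over the transformed encoder states.
            query = hidden[-1].unsqueeze(1).expand(-1, enc_states.size(1), -1)
            scores = self.attn(torch.cat([query, enc_states], dim=-1))
            context = (scores.softmax(dim=1) * enc_states).sum(dim=1)
            step_in = torch.cat([tgt_embedded[:, t], context], dim=-1).unsqueeze(1)
            dec_out, hidden = self.decoder(step_in, hidden)
            logits.append(self.out(dec_out.squeeze(1)))
        return torch.stack(logits, dim=1)        # (batch, tgt_len, tgt_vocab)

# Usage (shapes only): src and tgt are batches of token ids.
model = Seq2SeqWithTransform(src_vocab=32000, tgt_vocab=32000)
logits = model(torch.randint(0, 32000, (2, 7)), torch.randint(0, 32000, (2, 5)))

In this sketch the transform layer is a single feed-forward block applied to every encoder state and to the initial decoder state; the paper's actual layer may differ in form and placement.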

Key words: deep learning, neural machine translation, seq2seq model, attention mechanism, named entity recognition