• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (12): 2265-2272.

Chinese news text abstractive summarization with keywords fusion

NING Shan 1,2; YAN Xin 1,2; XU Guang-yi 3; ZHOU Feng 1,2; ZHANG Lei 1,2

  (1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504;

   2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504;

   3. Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming 650040, China)


  • Received: 2019-12-18 Revised: 2020-03-26 Accepted: 2020-12-25 Online: 2020-12-25 Published: 2021-01-05

Abstract: Existing seq2seq models often generate summaries that are semantically irrelevant to the source text, and they do not consider the role of keywords in summary generation. To address this problem, this paper proposes a keyword-fused abstractive summarization method for Chinese news text. First, the words of the source text are fed in order into a Bi-LSTM model, and the resulting hidden states are passed to a sliding convolutional neural network to extract local features between each word and its adjacent words. Second, keyword information and a gating unit are used to filter the news text information and remove redundant content. Third, the global feature information of each word is obtained through a self-attention mechanism, so that encoding yields a hierarchical representation combining local and global word features. Finally, the encoded word feature representation is fed into an LSTM model with an attention mechanism, which decodes the summary. In short, the method models the n-gram features of news words with a sliding convolutional network and, on that basis, uses self-attention to obtain hierarchical local and global word feature representations. It also takes into account the important role of keywords in abstractive summarization, using the gating unit to remove redundant information and obtain more accurate news text information. Experiments on the Sogou news corpus show that the method effectively improves the quality of the generated summaries and raises the ROUGE-1, ROUGE-2, and ROUGE-L scores.
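
The keyword-guided gating step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so everything here is an assumption made for illustration: the function name `keyword_gate`, the weight vectors `w_h` and `w_k`, the bias, and the choice of a single scalar sigmoid gate per word (the paper's gating unit may well be element-wise and use learned matrices).

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def keyword_gate(
    hidden_states: List[Vector],
    keyword_vec: Vector,
    w_h: Vector,
    w_k: Vector,
    bias: float = 0.0,
) -> List[Vector]:
    """Scale each word representation by a gate conditioned on the keywords.

    Hypothetical form (not taken from the paper):
        g_i  = sigmoid(w_h . h_i + w_k . k + bias)
        h_i' = g_i * h_i
    A gate near 0 suppresses a word; a gate near 1 passes it through.
    """
    keyword_score = dot(w_k, keyword_vec)  # shared across all words
    filtered = []
    for h in hidden_states:
        gate = sigmoid(dot(w_h, h) + keyword_score + bias)
        filtered.append([gate * x for x in h])
    return filtered
```

With all weights set to zero the gate is exactly 0.5 for every word, so each hidden state is halved; in a trained model the weights would be learned so that words unrelated to the keywords receive smaller gates.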

Key words: text abstractive summarization, sliding convolutional network, keyword information fusion, gating unit, global coding
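
For readers unfamiliar with the ROUGE-1, ROUGE-2, and ROUGE-L metrics named in the abstract, the following is a minimal sketch of their recall-oriented forms. It is an assumption that recall is the variant of interest (the abstract does not say whether recall, precision, or F1 was reported), and the function names are illustrative only. For Chinese text the token sequences would typically be characters or segmented words.

```python
from collections import Counter
from typing import Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of a token sequence (empty if it is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: Sequence[str], reference: Sequence[str], n: int) -> float:
    """Fraction of reference n-grams that also occur in the candidate (clipped counts)."""
    ref_counts = ngrams(reference, n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum((ref_counts & ngrams(candidate, n)).values())
    return overlap / total


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """LCS length divided by the reference length."""
    if not reference:
        return 0.0
    return lcs_length(candidate, reference) / len(reference)
```

As a usage example, with `reference = list("abcd")` and `candidate = list("abxd")`, three of the four reference unigrams and one of the three reference bigrams are matched, and the longest common subsequence is `a, b, d`.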