• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (12): 2266-2272.

• Artificial Intelligence and Data Mining •

A multi-level semantic information fusion coding method for sequence labeling

CAI Yu-qi, GUO Wei-bin

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2021-03-08  Revised: 2021-07-21  Accepted: 2022-12-25  Online: 2022-12-25  Published: 2023-01-05
  • Supported by:
    National Natural Science Foundation of China (61672227)

Abstract: Sequence labeling is a fundamental task in natural language processing. Most current sequence labeling methods use recurrent neural networks or their variants to directly extract contextual semantic information from a sequence. Although these models effectively capture continuous dependencies between words and achieve good performance, they are less capable of capturing discrete dependencies within a sequence, and they ignore the relationship between words and labels. This paper therefore proposes a multi-level semantic information fusion coding method. Firstly, contextual semantic information is extracted from the sequence by a bidirectional long short-term memory (BiLSTM) network. Secondly, an attention mechanism adds label semantic information to the contextual semantic information, yielding label-aware contextual semantic information. Thirdly, a self-attention mechanism captures discrete dependencies in the sequence, yielding contextual semantic information that contains those dependencies. Finally, a fusion mechanism combines the three kinds of semantic information into a new fused representation. Experimental results show that, compared with directly encoding the sequence with a recurrent neural network or its variants, the multi-level semantic information fusion coding method significantly improves model performance.

Key words: sequence labeling, multi-level semantic information fusion coding, label semantic information, attention mechanism, fusion mechanism
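The encoding pipeline described in the abstract can be sketched numerically. The sketch below is illustrative only, not the authors' implementation: the BiLSTM stage is replaced by a placeholder matrix `H` of contextual vectors, `L` stands for learned label embeddings, and the gated fusion here is a deliberately simplified stand-in for the paper's fusion mechanism. All function names and shapes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_level_fusion(H, L):
    """Toy sketch of multi-level semantic information fusion coding.

    H: (T, d) contextual vectors for T words (stand-in for BiLSTM outputs).
    L: (K, d) embeddings for K labels.
    Returns a (T, d) fused representation.
    """
    T, d = H.shape
    # Level 2: attention over label embeddings adds label semantics
    # to each word's contextual vector (label-aware context).
    A = softmax(H @ L.T) @ L                 # (T, d)
    # Level 3: scaled dot-product self-attention captures discrete
    # (long-range, non-adjacent) dependencies between words.
    S = softmax(H @ H.T / np.sqrt(d)) @ H    # (T, d)
    # Fusion: a simple per-word gate weights the three information
    # levels (context H, label-aware A, self-attended S) and sums them.
    stack = np.stack([H, A, S])              # (3, T, d)
    gates = softmax(stack.mean(axis=-1, keepdims=True), axis=0)  # (3, T, 1)
    return (gates * stack).sum(axis=0)       # (T, d)

# Usage with random stand-in inputs (5 words, 4 labels, dim 8):
rng = np.random.default_rng(0)
fused = multi_level_fusion(rng.standard_normal((5, 8)),
                           rng.standard_normal((4, 8)))
```

In the paper's full model, the fused representation would then feed a label decoder; here the gate is computed from the vectors themselves purely to keep the sketch self-contained.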