[1] Zhang H L,Xiao L Q,Chen W Q,et al.Multi-task label embedding for text classification[J].arXiv:1710.07210,2017.
[2] Zhang Y,Chen H S,Zhao Y H,et al.Learning tag dependencies for sequence tagging[C]∥Proc of the 27th International Joint Conference on Artificial Intelligence,2018:4581-4587.
[3] Hochreiter S,Schmidhuber J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
[4] Lafferty J,McCallum A,Pereira F C N.Conditional random fields:Probabilistic models for segmenting and labeling sequence data[C]∥Proc of the 18th International Conference on Machine Learning,2001:282-289.
[5] Baum L E,Petrie T.Statistical inference for probabilistic functions of finite state Markov chains[J].The Annals of Mathematical Statistics,1966,37(6):1554-1563.
[6] Kudo T,Matsumoto Y.Use of support vector learning for chunk identification[C]∥Proc of the 4th Conference on Computational Natural Language Learning and the 2nd Learning Language in Logic Workshop,2000:142-144.
[7] Collobert R,Weston J,Bottou L,et al.Natural language processing (almost) from scratch[J].Journal of Machine Learning Research,2011,12:2493-2537.
[8] Huang Z H,Xu W,Yu K.Bidirectional LSTM-CRF models for sequence tagging[J].arXiv:1508.01991,2015.
[9] Lample G,Ballesteros M,Subramanian S,et al.Neural architectures for named entity recognition[J].arXiv:1603.01360,2016.
[10] Ma X Z,Hovy E.End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF[J].arXiv:1603.01354,2016.
[11] Liu L Y,Shang J B,Xu F F,et al.Empower sequence labeling with task-aware neural language model[J].arXiv:1709.04109,2017.
[12] Yamada I,Asai A,Shindo H,et al.LUKE:Deep contextualized entity representations with entity-aware self-attention[J].arXiv:2010.01057,2020.
[13] Jiang Y F,Hu C,Xiao T,et al.Improved differentiable architecture search for language modeling and named entity recognition[C]∥Proc of 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing,2019:3576-3581.
[14] Vaswani A,Shazeer N,Parmar N,et al.Attention is all you need[C]∥Proc of the 31st International Conference on Neural Information Processing Systems,2017:5998-6008.
[15] Yu A W,Dohan D,Luong M T,et al.QANet:Combining local convolution with global self-attention for reading comprehension[J].arXiv:1804.09541,2018.
[16] Lin Z H,Feng M W,Santos C N,et al.A structured self-attentive sentence embedding[J].arXiv:1703.03130,2017.
[17] Tang J,Qu M,Mei Q Z.PTE:Predictive text embedding through large-scale heterogeneous text networks[C]∥Proc of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,2015:1165-1174.
[18] Nam J,Mencía E L,Fürnkranz J.All-in text:Learning document,label,and word representations jointly[C]∥Proc of the 30th AAAI Conference on Artificial Intelligence,2016:1948-1954.
[19] Wang G Y,Li C Y,Wang W L,et al.Joint embedding of words and labels for text classification[J].arXiv:1805.04174,2018.
[20] Cui L Y,Zhang Y.Hierarchically-refined label attention network for sequence labeling[J].arXiv:1908.08676,2019.
[21] Pennington J,Socher R,Manning C D.GloVe:Global vectors for word representation[C]∥Proc of 2014 Conference on Empirical Methods in Natural Language Processing,2014:1532-1543.