Entity relation extraction based on prejudgment and multi-round classification for span

Computer Engineering & Science, 2024, Vol. 46, Issue (05): 916-928.

TONG Yuan, YAO Nian-min

(School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China)

Received: 2023-02-06 Revised: 2023-04-19 Accepted: 2024-05-25 Online: 2024-05-25 Published: 2024-05-30

Abstract


Abstract: For the entity recognition and relation extraction tasks in natural language processing, a model named Smrc is proposed that makes predictions over token sequences (also called spans). The model uses a pre-trained BERT model as its encoder and includes three modules: entity pre-judgment (Pej), entity multi-round classification (Emr), and relation multi-round classification (Rmr). Smrc performs entity recognition through the preliminary judgment of the Pej module followed by the multi-round entity classification of the Emr module. It then uses the multi-round relation classification of the Rmr module to determine the relation between each pair of entities, which completes the relation extraction task. On the CoNLL04, SciERC, and ADE datasets, Smrc reaches entity recognition F1 values of 89.67%, 70.62%, and 89.56%, respectively, and relation extraction F1 values of 73.11%, 51.03%, and 79.89%, respectively. Compared with Spert, the previous best model on the three datasets, Smrc improves F1 by 0.73%, 0.29%, and 0.61% in entity recognition and by 1.64%, 0.19%, and 1.05% in relation extraction. The gains are attributed to entity pre-judgment and to the multi-round classification of entities and relations, and they demonstrate the effectiveness and advantages of the model.

Key words: pre-judgment of span, entity relation extraction, BERT pretraining model, multi-round entity classification, multi-round relation classification
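
The three-stage flow described in the abstract (pre-judge candidate spans, classify the surviving spans as entities, then classify pairs of entities) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not describe the internals of Pej, Emr, or Rmr, so the three modules appear here as hypothetical callables (`prejudge`, `classify_entity`, `classify_relation`), the `max_width` limit is an assumption, and the toy rules and labels in the usage part stand in for the BERT-based classifiers.

```python
from itertools import permutations
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """Return every contiguous token span of at most max_width tokens."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + max_width, n) + 1)]


def extract(
    tokens: List[str],
    max_width: int,
    prejudge: Callable[[List[str], Span], bool],
    classify_entity: Callable[[List[str], Span], Optional[str]],
    classify_relation: Callable[[List[str], Tuple[Span, str], Tuple[Span, str]], Optional[str]],
):
    """Hypothetical pipeline: pre-judgment, entity classification, relation classification."""
    # Stage 1 (stand-in for Pej): discard spans judged unlikely to be entities.
    candidates = [s for s in enumerate_spans(tokens, max_width) if prejudge(tokens, s)]

    # Stage 2 (stand-in for Emr): assign an entity type or reject the span.
    entities: Dict[Span, str] = {}
    for span in candidates:
        label = classify_entity(tokens, span)
        if label is not None:
            entities[span] = label

    # Stage 3 (stand-in for Rmr): classify every ordered pair of entities.
    relations = []
    for head, tail in permutations(entities.items(), 2):
        rel = classify_relation(tokens, head, tail)
        if rel is not None:
            relations.append((head[0], rel, tail[0]))
    return entities, relations


# Toy stand-ins for the learned modules (assumptions for illustration only).
LEXICON = {"John Smith": "Peop", "Paris": "Loc"}


def toy_prejudge(tokens, span):
    return all(t.istitle() for t in tokens[span[0]:span[1]])


def toy_entity(tokens, span):
    return LEXICON.get(" ".join(tokens[span[0]:span[1]]))


def toy_relation(tokens, head, tail):
    return "Live_In" if (head[1], tail[1]) == ("Peop", "Loc") else None


tokens = ["John", "Smith", "lives", "in", "Paris"]
entities, relations = extract(tokens, 2, toy_prejudge, toy_entity, toy_relation)
print(entities)
print(relations)
```

In this sketch the pre-judgment step shrinks the candidate set before the more expensive classification stages run, which is one plausible reading of why a preliminary judgment precedes entity classification.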

TONG Yuan, YAO Nian-min. Entity relation extraction based on prejudgment and multi-round classification for span[J]. Computer Engineering & Science, 2024, 46(05): 916-928.
