• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (12): 2253-2260.

• Artificial Intelligence and Data Mining •

Domain oriented discontinuous named entity recognition based on large language model

TANG Jintao, ZHANG Chengxian, BAO Chenlong, LI Wenjing
  

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)
  • Received: 2024-05-11 Revised: 2024-08-18 Online: 2025-12-25 Published: 2026-01-06

Abstract: In professional fields, terms combine in more complex ways, so complex entities are often expressed as discontinuous named entities. To address the task of discontinuous named entity recognition (DNER), this paper proposes a recognition method that leverages the understanding and generation capabilities of large language models (LLMs). The method treats discontinuous entity recognition as a sentence rewriting task: it designs rules to convert DNER datasets into sentence rewriting datasets and performs output fine-tuning on the large language model. In the named entity recognition phase, it uses prompt learning to design rule-based instructions over the rewritten sentences, and it implicitly supplies the large language model with domain-specific information (e.g., the field of the data) through role-play dialogue, which further improves entity recognition performance. Experimental results on three datasets, the CSIRO Adverse Drug Event Corpus (CADEC), Shared Annotated Resources 2013 (ShARe13), and Shared Annotated Resources 2014 (ShARe14), show that the method improves F1 scores by 3.23%, 0.28%, and 1.04%, respectively, over state-of-the-art (SOTA) methods based on small models. These results verify that the generation capability of large models benefits complex named entity recognition tasks in professional fields.
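
The abstract does not spell out the conversion rules, so the following is only a minimal sketch of what turning a DNER annotation into a sentence rewriting target might look like. It assumes one hypothetical rule: the last fragment of each discontinuous entity is replaced by the full entity text, so that every entity becomes contiguous in the rewritten sentence. The function name `rewrite_sentence`, the half-open token-span format, and the example sentence are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end); assumed format


def rewrite_sentence(tokens: List[str], entities: List[List[Span]]) -> str:
    """Rewrite a sentence so that discontinuous entities become contiguous.

    Hypothetical rule (not taken from the paper): for every entity made of
    two or more fragments, replace its last fragment with the concatenation
    of all its fragments. Contiguous entities are left unchanged.
    """
    # Maps the start index of a last fragment to (fragment end, full entity tokens).
    replacements: Dict[int, Tuple[int, List[str]]] = {}
    for fragments in entities:
        ordered = sorted(fragments)
        if len(ordered) < 2:
            continue  # already contiguous
        full = [tok for start, end in ordered for tok in tokens[start:end]]
        last_start, last_end = ordered[-1]
        replacements[last_start] = (last_end, full)

    rewritten: List[str] = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, full = replacements[i]
            rewritten.extend(full)  # emit the whole entity in one piece
            i = end                 # skip the original last fragment
        else:
            rewritten.append(tokens[i])
            i += 1
    return " ".join(rewritten)


# Illustrative example: "pain in arms" is contiguous, while
# "pain in ... shoulders" is split across two fragments.
tokens = "pain in arms and shoulders".split()
entities = [[(0, 3)], [(0, 2), (4, 5)]]
rewritten = rewrite_sentence(tokens, entities)
```

Under this assumed rule, pairs of original and rewritten sentences could serve as fine-tuning data, and entity recognition could then operate on rewritten sentences in which every entity is a contiguous span.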


Key words: named entity recognition, large language model (LLM), discontinuous named entity