• Official journal of the China Computer Federation (CCF)
• China Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science

ERC-KG: A Method for Constructing Domain Knowledge Graphs by Integrating Large Language Models

LI Xiangcheng, WANG Yongwei, LI Qiang, LIU Pengcheng, TANG Jipeng   

  1. (Department of Cryptographic Engineering, Information Engineering University, PLA Cyberspace Force, Zhengzhou, Henan 450000, China)

Abstract: Traditional knowledge graph construction relies mainly on data preprocessing, entity recognition, relation extraction, and entity alignment, a pipeline that typically incurs high computational and time costs. To address this problem, ERC-KG (Extraction, Retrieval, and Error Correction Knowledge Graph) is proposed, a domain knowledge graph construction method that combines large language model extraction, retrieval, and error correction to improve the efficiency and accuracy of knowledge graph generation. First, feature word extraction is combined with domain expert knowledge to determine the entity set of the knowledge graph. Second, an entity corpus retriever is constructed to select the context sentences most relevant to each target entity as input to the large language model. Third, a prompt template and a verification feedback mechanism are designed to achieve high-quality triple extraction, and a knowledge graph for the national defense science and technology domain is constructed. Experimental results show that the precision of graph construction reaches 94.32%, which supports the precision and soundness of the method and offers a new approach to the rapid construction of domain knowledge graphs.


Key words: large language model, knowledge graph construction, prompt learning
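
The retrieve, prompt, and verify loop summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the retriever, the prompt wording, or the verification rules, so everything below is an assumption made for illustration: a bag-of-words cosine similarity stands in for the entity corpus retriever, `PROMPT` is an invented template, `is_valid` is a hypothetical check (head must be in the entity set, tail must appear in the retrieved context), and `fake_llm` is a stub in place of a real large language model. The `precision` helper uses the standard definition (correct extracted triples divided by all extracted triples).

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    # Assumption: simple lowercase alphanumeric tokens (not suitable for Chinese text).
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(entity, sentences, k=2):
    # Stand-in retriever: keep the k sentences most similar to the entity name,
    # dropping sentences that share no token with it.
    query = Counter(tokenize(entity))
    scored = [(cosine(query, Counter(tokenize(s))), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


# Hypothetical prompt template; the paper's actual template is not given here.
PROMPT = (
    "Extract (head, relation, tail) triples about '{entity}' from the context.\n"
    "Context:\n{context}\n{feedback}"
)


def is_valid(triple, entities, context):
    # Hypothetical verification rule: known head entity, tail grounded in the context.
    head, _relation, tail = triple
    return head in entities and tail.lower() in context.lower()


def extract_triples(entity, sentences, entities, llm, max_rounds=2):
    # Query the model, verify its triples, and feed rejected triples back for correction.
    context = "\n".join(retrieve(entity, sentences))
    feedback, accepted = "", []
    for _ in range(max_rounds):
        prompt = PROMPT.format(entity=entity, context=context, feedback=feedback)
        candidates = llm(prompt)
        rejected = []
        for triple in candidates:
            if is_valid(triple, entities, context):
                if triple not in accepted:
                    accepted.append(triple)
            else:
                rejected.append(triple)
        if not rejected:
            break
        feedback = "These triples failed verification, please correct them: " + repr(rejected)
    return accepted


def precision(extracted, gold):
    # Fraction of extracted triples that also appear in the reference set.
    return len(set(extracted) & set(gold)) / len(extracted) if extracted else 0.0


def fake_llm(prompt):
    # Stub model: returns an ungrounded triple first, a corrected one after feedback.
    if "failed verification" in prompt:
        return [("radar system", "uses", "phased array antenna")]
    return [("radar system", "uses", "laser cannon")]


if __name__ == "__main__":
    corpus = [
        "The radar system uses a phased array antenna.",
        "The budget was approved last year.",
    ]
    known = {"radar system", "phased array antenna"}
    print(extract_triples("radar system", corpus, known, fake_llm))
```

In a real setting, `fake_llm` would be replaced by a call to an actual model, and the similarity function by whatever retriever suits the corpus; the loop structure is the only part this sketch is meant to convey.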