A commonsense question answering method based on multi-source knowledge infusion

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (02): 349-360.

ZHU Jiajun, BAO Meikai, ZHANG Kai, LIU Ye, LIU Qi

(State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei 230027, China)

Received: 2024-06-30 Revised: 2024-08-23 Accepted: 2025-02-25 Online: 2025-02-25 Published: 2025-02-24

Abstract

Abstract: The commonsense question answering task aims to have models answer questions that require human commonsense knowledge. One class of methods for this task retrieves relevant knowledge to help the model answer such questions. These methods consist of two main steps: knowledge retrieval and knowledge reasoning. Knowledge retrieval finds the knowledge associated with a question, while knowledge reasoning uses the retrieved knowledge to help answer it. A key challenge for commonsense question answering is therefore how to find suitable external knowledge. Many existing commonsense question answering models rely on a single external knowledge source, but given the breadth and diversity of commonsense knowledge, a single source can hardly cover all the knowledge required. To address this problem, this paper proposes a commonsense question answering method based on multi-source knowledge infusion. First, to address the knowledge coverage problem during knowledge retrieval, a pre-trained language model is used to integrate knowledge from multiple sources (both structured and unstructured) into a unified knowledge representation. Second, to make full use of the semantic relations contained in structured knowledge during knowledge reasoning, the model identifies the entity concepts in the text and the relation paths between entities to construct an entity relation graph, which is then modeled with a graph attention network. Finally, the evidence in the entity relation graph and the entity knowledge representations is used to reason about and answer the question. Experimental results on the CommonsenseQA dataset show that the model obtained by pre-training with the proposed method achieves accuracies of 79.20% and 75.02% on the validation set and the test set, respectively, exceeding the best baseline model. These results demonstrate the effectiveness of multi-source knowledge infusion for commonsense question answering.

Key words: commonsense question answering, knowledge infusion, pre-trained language model, graph neural network, attention mechanism
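The retrieve-then-reason flow described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy triples and definitions, the `embed` stand-in for a pre-trained language model, the helper names, and the fixed attention vector are all assumptions made for illustration. The attention step is a simplified single-head graph attention update with no learned weight matrix, and it ignores relation labels, which the paper's entity relation graph may well use.

```python
import math

# Hypothetical toy knowledge; none of it comes from the paper.
TRIPLES = [  # structured source (knowledge-graph style)
    ("bird", "CapableOf", "fly"),
    ("bird", "HasA", "wing"),
    ("bird", "HasA", "feather"),
    ("wing", "UsedFor", "fly"),
]
DEFINITIONS = {  # unstructured source (dictionary style)
    "bird": "a warm-blooded animal with feathers and wings",
    "wing": "a limb used for flying",
    "fly": "to move through the air",
    "feather": "a light growth covering a bird's skin",
}


def unified_text(entity):
    """Merge both sources into one text per entity (knowledge retrieval)."""
    facts = [f"{h} {r} {t}" for h, r, t in TRIPLES if entity in (h, t)]
    return f"{entity}: {DEFINITIONS.get(entity, '')}. " + " ".join(facts)


def embed(text, dim=4):
    """Stand-in for a pre-trained language model encoder: a deterministic
    character-hash vector, used only to keep the sketch self-contained."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_graph(entities):
    """Undirected entity graph with self-loops, built from the triples."""
    neighbors = {e: {e} for e in entities}
    for h, _, t in TRIPLES:
        if h in neighbors and t in neighbors:
            neighbors[h].add(t)
            neighbors[t].add(h)
    return neighbors


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, neighbors, attn):
    """One simplified graph attention update: score each neighbor with
    LeakyReLU(attn . [h_i || h_j]), softmax over the neighborhood, then
    take the attention-weighted average of neighbor features."""
    updated, weights = {}, {}
    for i, nbrs in neighbors.items():
        nbrs = sorted(nbrs)
        scores = [
            leaky_relu(sum(a * x for a, x in zip(attn, features[i] + features[j])))
            for j in nbrs
        ]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        weights[i] = dict(zip(nbrs, alpha))
        updated[i] = [
            sum(a * features[j][k] for a, j in zip(alpha, nbrs))
            for k in range(len(features[i]))
        ]
    return updated, weights


entities = ["bird", "wing", "fly", "feather"]
features = {e: embed(unified_text(e)) for e in entities}
graph = build_graph(entities)
attn = [0.1 * (k + 1) for k in range(8)]  # arbitrary fixed attention vector
updated, weights = gat_layer(features, graph, attn)
```

In a real system the `embed` function would be replaced by a trained encoder and the attention parameters would be learned; the sketch only shows how unified entity representations and graph structure feed into one attention step.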


ZHU Jiajun, BAO Meikai, ZHANG Kai, LIU Ye, LIU Qi. A commonsense question answering method based on multi-source knowledge infusion[J]. Computer Engineering & Science, 2025, 47(02): 349-360.

