A multi-hop knowledge graph reasoning method based on reinforcement learning

Abstract

In recent years, applying reinforcement learning to knowledge reasoning has shown promising performance, but it faces two key challenges: agents tend to explore aimlessly, and rewards are delayed and sparse. To address these challenges, a multi-hop knowledge reasoning model based on reinforcement learning and predictive information embedding is proposed. First, a module is designed to acquire predictive embedding information and incorporate it into the reinforcement learning framework, so that agents are less likely to get trapped in aimless exploration or to select ineffective actions. Second, an action pruning mechanism that combines the predictive information with the Dropout concept is applied during graph traversal to reduce the excessively large action space. Third, an LSTM retains the agent's historical decision-making information, enabling the agent to select the most promising action at each step. Finally, a new reward function based on the predictive information mitigates the delayed and sparse reward problems. Experimental results on the WebQSP, PQL, and MetaQA datasets show that the proposed model performs well on knowledge reasoning tasks and is well suited to multi-hop question answering over knowledge graphs.

Key words: knowledge graph, reinforcement learning, knowledge reasoning
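The abstract does not give implementation details, so the following is only a minimal sketch of how dropout-style action pruning and a prediction-based reward might look. It is not the authors' implementation. All names (`cosine`, `prune_actions`, `shaped_reward`) and parameters (`keep_top`, `drop_rate`, `weight`) are hypothetical, and the use of cosine similarity to a predicted target embedding is an assumption made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def prune_actions(actions, predicted, keep_top=2, drop_rate=0.5, rng=None):
    """Hypothetical pruning step: always keep the keep_top actions whose
    embeddings are closest to the predicted embedding, then drop each
    remaining action independently with probability drop_rate."""
    rng = rng or random.Random()
    # Each action is a (name, embedding) pair; rank by similarity, best first.
    ranked = sorted(actions, key=lambda a: cosine(a[1], predicted), reverse=True)
    kept = ranked[:keep_top]
    kept += [a for a in ranked[keep_top:] if rng.random() >= drop_rate]
    return kept


def shaped_reward(reached_answer, current, predicted, weight=0.5):
    """Hypothetical reward: 1.0 when the answer is reached, otherwise a
    soft reward proportional to similarity with the predicted embedding."""
    if reached_answer:
        return 1.0
    return weight * max(0.0, cosine(current, predicted))


if __name__ == "__main__":
    candidate_actions = [
        ("r1", [1.0, 0.0]),
        ("r2", [0.0, 1.0]),
        ("r3", [0.9, 0.1]),
        ("r4", [-1.0, 0.0]),
    ]
    target = [1.0, 0.0]
    kept = prune_actions(candidate_actions, target, rng=random.Random(0))
    print([name for name, _ in kept])
    print(shaped_reward(False, [0.9, 0.1], target))
```

In this sketch the soft reward gives the agent a signal at every step rather than only at the end of an episode, which is one common way to ease sparse and delayed rewards. The weighting and similarity measure the paper actually uses may differ.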

HAN Zheng, XU Ruzhi, LIU Xiaohua. A multi-hop knowledge graph reasoning method based on reinforcement learning[J]. Computer Engineering & Science, 2026, 48(2): 256-267.

