• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Artificial Intelligence and Data Mining

Optimal strategy planning of BDI agent based on Q-learning in uncertain environments

WAN Qian1,2, LIU Wei1,2, XU Longlong1,2, GUO Jingzhi1,2

  (1. School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430073, China;
    2. Hubei Provincial Key Laboratory of Intelligent Robot, Wuhan 430073, China)
  • Received: 2018-06-05 Revised: 2018-08-13 Online: 2019-01-25 Published: 2019-01-25
  • Supported by:

    National Natural Science Foundation of China (61502355); the 9th Graduate Education Innovation Fund of Wuhan Institute of Technology (CX2017068)

Abstract:

The belief-desire-intention (BDI) model handles agent reasoning and decision-making well in a specific, known environment, but it lacks the ability to make decisions and learn in dynamic and uncertain environments. Reinforcement learning addresses agent decision-making in unknown environments, but lacks the rule description and logical reasoning of the BDI model. To address strategy planning for BDI agents in unknown and dynamic environments, we propose a method based on the Q-learning algorithm of reinforcement learning that enables a BDI agent to learn and plan, and we improve the decision-making mechanism of the AgentSpeak language (ASL), an implementation model of BDI. Finally, we build a maze simulation on Jason, the ASL simulation platform. The simulation experiments show that, in the new ASL system with the Q-learning mechanism added, the agent can still complete its task in an uncertain environment.
 

Key words: BDI agent, reinforcement learning, Q-learning, ASL, Jason, planning
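
The method summarized above is implemented in Jason/ASL and is not reproduced on this page. As a rough illustration of the Q-learning component only, the following Python sketch trains a tabular agent on a hypothetical 4x4 maze in which each move slips to a random direction with probability 0.1, a simple stand-in for an uncertain environment. The maze layout, rewards, hyperparameters, and all function names are assumptions made for this example, not details taken from the paper.

```python
import random

# Hypothetical maze (an assumption): 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find(symbol):
    """Return the (row, col) of the first cell holding `symbol`."""
    for r, row in enumerate(MAZE):
        if symbol in row:
            return (r, row.index(symbol))
    raise ValueError(symbol)


def step(state, action, rng, slip=0.1):
    """Apply an action; with probability `slip` a random action replaces it."""
    if rng.random() < slip:
        action = rng.choice(list(ACTIONS))
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    # Moving off the grid or into a wall leaves the agent where it was.
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    done = MAZE[r][c] == "G"
    reward = 10.0 if done else -1.0  # assumed reward scheme
    return (r, c), reward, done


def train(episodes=3000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value; missing entries count as 0.0
    start = find("S")
    for _ in range(episodes):
        state = start
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action, rng)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned greedy policy with slip disabled; return visited cells."""
    rng = random.Random(0)  # never changes a move here because slip=0.0
    state = find("S")
    path = [state]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, reward, done = step(state, action, rng, slip=0.0)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In a BDI setting, a learned table like `q` could plausibly inform which plan or action the agent selects when its beliefs about the environment are incomplete. How the paper actually integrates learning into ASL's decision mechanism is described in the full text, not in this sketch.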