• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A modified U-tree algorithm based on effective instances
 

SONG Jiajia,WANG Zuowei   

  1. (School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin 300387, China)
  • Received: 2017-09-12 Revised: 2018-03-12 Online: 2019-01-25 Published: 2019-01-25

Abstract:

The traditional U-tree algorithm has achieved remarkable results on partially observable Markov decision process (POMDP) problems. However, because fringe nodes grow excessively and at random, problems such as large trees, high memory requirements, and high computational complexity remain. Building on the original U-tree algorithm, we classify the instances in the same leaf node that take the same action after receiving an observation value, and propose an effective-instance U-tree algorithm that extends fringe nodes based on effective instances. This greatly reduces the computational scale, helping the agent learn faster and better. Simulation experiments on the classic 4×3 grid problem show that the algorithm outperforms the original U-tree algorithm.
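
The classification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's data structures or its exact criterion for an "effective" instance, so everything here is an assumption: the `Instance` record, its fields, and the `group_leaf_instances` helper are hypothetical names, and the sketch only groups the instances stored in one leaf by their (action, observation) pair.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, NamedTuple, Tuple


class Instance(NamedTuple):
    """One recorded transition stored in a leaf (hypothetical layout)."""
    action: Hashable
    observation: Hashable
    reward: float


def group_leaf_instances(
    leaf_instances: List[Instance],
) -> Dict[Tuple[Hashable, Hashable], List[Instance]]:
    """Group a leaf's instances by (action, observation).

    Assumption: instances that share an action and the observation that
    followed it fall into the same class. How the paper then selects
    effective instances from these classes is not stated in the abstract.
    """
    groups: Dict[Tuple[Hashable, Hashable], List[Instance]] = defaultdict(list)
    for inst in leaf_instances:
        groups[(inst.action, inst.observation)].append(inst)
    return dict(groups)


# Illustrative data only; these values are not taken from the paper.
leaf = [
    Instance("up", "wall", -0.04),
    Instance("up", "wall", -0.04),
    Instance("left", "free", -0.04),
]
groups = group_leaf_instances(leaf)
```

With such a grouping, fringe tests could be evaluated per class rather than over every raw instance in the leaf, which is one plausible reading of how the computational scale is reduced.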

 

Key words: partially observable Markov decision process; reinforcement learning; U-tree; Q-learning algorithm