• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (11): 2053-2062.

• Artificial Intelligence and Data Mining •

Inverse reinforcement learning algorithm based on D2GA

DUAN Cheng-long, YUAN Jie, CHANG Qian-kun, ZHANG Ning-ning

  1. (School of Electrical Engineering, Xinjiang University, Urumqi 830017, China)
  • Received: 2023-09-12 Revised: 2024-02-20 Accepted: 2024-11-25 Online: 2024-11-25 Published: 2024-11-27

Abstract: To address the difficulty of obtaining expert demonstrations and the low utilization of generated samples in traditional generative adversarial reinforcement learning, a double-discriminator generative adversarial (D2GA) inverse reinforcement learning algorithm based on hindsight experience replay (HER) is proposed. In this algorithm, HER automatically synthesizes positive, expert-like samples, which are used in adversarial training against negative samples generated by D2GA and the reinforcement learning algorithm soft actor-critic (SAC). SAC is then used to solve for the optimal policy under the learned optimal reward function. The proposed D2GA algorithm is compared with classical inverse reinforcement learning algorithms on four tasks in the Fetch environment. The results show that, without any available demonstration data, D2GA reaches the desired task success rate within relatively few training rounds, outperforming currently popular inverse reinforcement learning algorithms.


Key words: deep reinforcement learning, hindsight experience replay, inverse reinforcement learning, generative adversarial network
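
The abstract's idea that HER can synthesize expert-like positive samples from the agent's own experience can be illustrated with a small sketch. It is not the authors' implementation: the `Transition` fields, the `hindsight_relabel` helper, the 0.05 success tolerance, and the choice of the "final" relabeling strategy (using the last achieved goal as the new desired goal) are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of a goal-conditioned episode (hypothetical layout)."""
    obs: Vec
    action: Vec
    achieved_goal: Vec  # goal actually reached after taking the action
    desired_goal: Vec   # goal the agent was asked to reach


def distance(a: Vec, b: Vec) -> float:
    """Euclidean distance between two goal vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def is_success(t: Transition, tol: float = 0.05) -> bool:
    """A step succeeds if the achieved goal is within tol of the desired goal."""
    return distance(t.achieved_goal, t.desired_goal) <= tol


def hindsight_relabel(episode: List[Transition]) -> List[Transition]:
    """Relabel every step with the goal reached at the end of the episode.

    This is the "final" HER strategy (assumed here): in hindsight, the
    trajectory is a successful demonstration for the goal it actually reached.
    """
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    return [replace(t, desired_goal=new_goal) for t in episode]


# A failed two-step episode: the agent was asked to reach (1.0, 1.0)
# but only got as far as (0.2, 0.1).
episode = [
    Transition((0.0, 0.0), (0.1, 0.0), (0.1, 0.0), (1.0, 1.0)),
    Transition((0.1, 0.0), (0.1, 0.1), (0.2, 0.1), (1.0, 1.0)),
]

# After relabeling, the same trajectory succeeds for its new goal and could
# serve as a positive, expert-like sample for a discriminator.
positives = hindsight_relabel(episode)
```

In this sketch the original episode fails under `is_success`, while the relabeled copy succeeds for the substituted goal, which is what allows such trajectories to stand in for demonstrations when none are available.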