Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (11): 2053-2062.
• Artificial Intelligence and Data Mining •
DUAN Cheng-long,YUAN Jie,CHANG Qian-kun,ZHANG Ning-ning
Abstract: To address the difficulty of obtaining expert demonstrations and the low utilization of generated samples in traditional generative adversarial reinforcement learning, a double discriminator generative adversarial (D2GA) inverse reinforcement learning algorithm based on hindsight experience replay (HER) is proposed. In this algorithm, HER automatically synthesizes positive, expert-like samples, which are used in adversarial training against negative samples generated by D2GA and the reinforcement learning algorithm soft actor-critic (SAC). Once the optimal reward function has been solved, SAC is used to solve for the optimal policy under it. The proposed D2GA algorithm is compared with classical inverse reinforcement learning algorithms on four tasks in the Fetch environment. The results show that, without any available demonstration data, the task success rate of D2GA reaches ideal performance within relatively few rounds, outperforming currently popular inverse reinforcement learning algorithms.
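The abstract does not spell out how HER synthesizes its expert-like samples. As a rough illustration only, the sketch below shows the generic hindsight relabeling idea using the common "final" strategy: every transition in an episode is relabeled as if the goal actually reached at the end had been the intended one, so even a failed episode yields "successful" trajectories that could serve as positive samples for a discriminator. The `Transition` fields, the helper names, and the tolerance value are all hypothetical and are not taken from the paper; the paper's actual relabeling strategy may differ.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One goal-conditioned step (hypothetical layout, not the paper's)."""
    obs: Vec
    action: Vec
    achieved_goal: Vec  # goal state actually reached after the action
    desired_goal: Vec   # goal the agent was asked to reach


def distance(a: Vec, b: Vec) -> float:
    """Euclidean distance between two goal vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def is_success(t: Transition, tol: float = 0.05) -> bool:
    """A step counts as successful if the achieved goal is within tol of the desired goal."""
    return distance(t.achieved_goal, t.desired_goal) <= tol


def hindsight_relabel(episode: List[Transition]) -> List[Transition]:
    """Relabel each transition with the goal achieved at the end of the episode."""
    if not episode:
        return []
    final_goal = episode[-1].achieved_goal
    return [replace(t, desired_goal=final_goal) for t in episode]


# A failed episode: the agent was asked to reach 1.0 but ended at 0.3.
episode = [
    Transition(obs=(0.0,), action=(1.0,), achieved_goal=(0.1,), desired_goal=(1.0,)),
    Transition(obs=(0.1,), action=(1.0,), achieved_goal=(0.3,), desired_goal=(1.0,)),
]
positives = hindsight_relabel(episode)  # candidate expert-like (positive) samples
negatives = episode                     # unmodified policy rollouts (negative samples)
```

Under these assumptions, the relabeled episode ends in success by construction, while the original rollout does not. In an adversarial setup like the one the abstract describes, `positives` would stand in for demonstrations and `negatives` for policy-generated data.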
Key words: deep reinforcement learning, hindsight experience replay, inverse reinforcement learning, generative adversarial network
DUAN Cheng-long, YUAN Jie, CHANG Qian-kun, ZHANG Ning-ning. Inverse reinforcement learning algorithm based on D2GA[J]. Computer Engineering & Science, 2024, 46(11): 2053-2062.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2024/V46/I11/2053