A strategy search method based on particle swarm optimization and deep reinforcement learning

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (04): 718-725.

• Artificial Intelligence and Data Mining •

PENG Kun-yan, YIN Xiang, LIU Xiao-zhu, LI Heng-yu

(School of Information Engineering (Artificial Intelligence), Yangzhou University, Yangzhou 225117, China)

Received: 2021-07-08 Revised: 2021-11-15 Accepted: 2023-04-25 Online: 2023-04-25 Published: 2023-04-13

Abstract

Deep reinforcement learning (DRL) is a popular policy search method that has been applied successfully to a range of challenging control tasks. However, DRL is hard to apply to large-scale practical problems because it copes poorly with sparse rewards, lacks effective exploration, and converges fragilely, being sensitive to hyperparameters. Particle swarm optimization (PSO) is an evolutionary optimization method that uses the cumulative reward of an entire episode as the fitness value, which makes it insensitive to sparse rewards. PSO also offers diverse, population-based exploration and stable convergence, but its sample efficiency is low. This paper combines PSO with policy-gradient-based DRL. DRL uses the varied data provided by the PSO population to train the policies with the lowest cumulative rewards in the population, and each policy whose cumulative reward improves after training is inserted back into the PSO population, strengthening the exchange of information between DRL and the PSO population. The resulting algorithm, called PSO-RL, improves the sample efficiency of PSO and the performance and stability of DRL. Experiments on challenging continuous control tasks in the PyBullet module show that PSO-RL performs better than both DRL and the evolutionary reinforcement learning algorithm.

Key words: particle swarm optimization, strategy search, deep reinforcement learning, policy gradient, reinforcement learning
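
The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the PyBullet environment is replaced by a hypothetical toy return function (`episode_return`), and policy-gradient training is replaced by plain gradient ascent on that function (`gradient_refine`). All names, the `TARGET` vector, and the hyperparameter values are assumptions chosen for illustration. What the sketch keeps is the structure: PSO uses whole-episode return as fitness, the lowest-return particles are refined, and a refined policy is written back into the population only if its return improved.

```python
import random

TARGET = [0.5, -1.0, 2.0]  # hypothetical optimum of the toy task (assumption)


def episode_return(theta):
    """Toy stand-in for the cumulative reward of one episode (assumption)."""
    return -sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def gradient_refine(theta, steps=5, lr=0.1):
    """Stand-in for DRL training: gradient ascent on the toy return."""
    theta = list(theta)
    for _ in range(steps):
        grad = [-2.0 * (t - g) for t, g in zip(theta, TARGET)]
        theta = [t + lr * d for t, d in zip(theta, grad)]
    return theta


def pso_rl(n_particles=10, n_iters=30, n_refine=2,
           w=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (best_parameters, best_return) found by the hybrid loop."""
    rng = random.Random(seed)
    dim = len(TARGET)
    pos = [[rng.uniform(-3, 3) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    fit = [episode_return(p) for p in pos]
    pbest = [list(p) for p in pos]          # personal best positions
    pbest_fit = list(fit)
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]  # global best

    for _ in range(n_iters):
        # PSO step: standard inertia + cognitive + social velocity update.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit[i] = episode_return(pos[i])

        # Refinement step: train the lowest-return particles and insert
        # the result back into the population only if the return improved.
        worst = sorted(range(n_particles), key=lambda i: fit[i])[:n_refine]
        for i in worst:
            candidate = gradient_refine(pos[i])
            cand_fit = episode_return(candidate)
            if cand_fit > fit[i]:
                pos[i], fit[i] = candidate, cand_fit

        # Update personal and global bests.
        for i in range(n_particles):
            if fit[i] > pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(pos[i]), fit[i]
            if fit[i] > gbest_fit:
                gbest, gbest_fit = list(pos[i]), fit[i]
    return gbest, gbest_fit
```

Calling `pso_rl()` returns the best parameter vector and its toy return. In a real setting, `episode_return` would roll out a neural-network policy in the environment and `gradient_refine` would be a policy-gradient learner trained on experience collected by the population; neither is modeled here.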

PENG Kun-yan, YIN Xiang, LIU Xiao-zhu, LI Heng-yu. A strategy search method based on particle swarm optimization and deep reinforcement learning[J]. Computer Engineering & Science, 2023, 45(04): 718-725.
