• A publication of the China Computer Federation (CCF)
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)


Methods for improving sample efficiency in deep reinforcement learning based on regularization layers

SUN Hao, WANG Chang-peng

(School of Science, Chang'an University, Xi'an 710064, Shaanxi, China)


Abstract: With the continued advancement of deep neural networks, their combination with reinforcement learning has produced deep reinforcement learning, significantly improving decision-making in high-dimensional, complex environments. However, current deep reinforcement learning algorithms still suffer from low sample efficiency: an agent needs a large amount of interaction data to learn an effective policy. Through experimental analysis, we investigate the underlying causes of low sample efficiency and, based on these findings, propose a mixed scaling smoothing regularization layer that operates directly on the feature representations produced by the convolutional encoder. By randomly mixing different local regions of the same feature map to generate new feature representations, the method introduces multi-scale variation in the latent space, effectively preventing overfitting in the early stages of training and thereby improving sample efficiency. Experimental results in several simulated environments show that our method outperforms existing state-of-the-art methods, improving sample efficiency by 18.13%. The method is simple to implement and highly general, offering a new route to improving sample efficiency in deep reinforcement learning.
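The abstract only sketches the mechanism, not its implementation. As a rough illustration of the idea — randomly blending local regions of the same feature map at a randomly chosen scale — the following minimal NumPy sketch shows one plausible form; the function name, window-size range, and mixing-coefficient range are our assumptions, not the authors' code:

```python
import numpy as np

def mixed_region_blend(features, rng, min_lam=0.5):
    """Illustrative regularizer (an assumption, not the paper's code):
    blend two randomly placed, same-sized local regions of a feature map.

    features: array of shape (C, H, W) from a convolutional encoder.
    The window size is itself random, giving the multi-scale variation
    the abstract describes.
    """
    c, h, w = features.shape
    out = features.copy()
    # Random window size: between a quarter and half of the map extent.
    wh = int(rng.integers(max(1, h // 4), h // 2 + 1))
    ww = int(rng.integers(max(1, w // 4), w // 2 + 1))
    # Two random top-left corners: the target region and the source region.
    y1 = int(rng.integers(0, h - wh + 1)); x1 = int(rng.integers(0, w - ww + 1))
    y2 = int(rng.integers(0, h - wh + 1)); x2 = int(rng.integers(0, w - ww + 1))
    lam = float(rng.uniform(min_lam, 1.0))  # mixing coefficient
    src = features[:, y2:y2 + wh, x2:x2 + ww]
    # Convex combination of the two regions, applied to every channel.
    out[:, y1:y1 + wh, x1:x1 + ww] = (
        lam * features[:, y1:y1 + wh, x1:x1 + ww] + (1.0 - lam) * src
    )
    return out

rng = np.random.default_rng(0)
f = rng.standard_normal((32, 16, 16))  # e.g. 32 channels of 16x16 features
g = mixed_region_blend(f, rng)
print(g.shape)  # shape is preserved; only local region values are blended
```

In training, such a layer would be applied stochastically to encoder outputs (and disabled at evaluation time), in the spirit of dropout-style regularization; the paper itself should be consulted for the exact formulation.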

Key words: deep reinforcement learning; sample efficiency; regularization layer; reinforcement learning