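Resource allocation algorithm for distinguished services in vehicular networks based on multi-agent deep reinforcement learning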

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (10): 1757-1764.

• Computer Network and Information Security •

CAI Yu, GUAN Zheng, WANG Zeng-wen, WANG Xue, YANG Zhi-jun

(School of Information Science & Engineering, Yunnan University, Kunming 650500, China)

Received: 2024-01-03 Revised: 2024-03-06 Online: 2024-10-25 Published: 2024-10-29
Funding:
Yunnan Provincial Applied Basic Research Program (202201AT070167); Yunnan Provincial Expert Workstation Project (202305AF150045); Scientific Research Fund of the Yunnan Provincial Department of Education (2023Y0246)

Abstract

Abstract: The Internet of Vehicles (IoV) generates massive numbers of network connections and highly diverse data. A single agent struggles to collect channel state information and to perform service-differentiated resource allocation and link scheduling in dynamic scenarios. To address this, a service-differentiated resource allocation algorithm for IoV based on multi-agent deep reinforcement learning is proposed. The algorithm aims to maximize the packet delivery success rate of V2V links and the total capacity of V2I links, subject to minimizing the interference on emergency service links. It uses deep reinforcement learning to optimize spectrum allocation and power selection in a single-antenna vehicular network where multiple cellular users and device-to-device users coexist. Each agent is trained with a deep Q-network (DQN); the agents interact jointly with the communication environment and cooperate through a global reward function. Simulation results show that, in high-load scenarios and compared with a traditional random allocation algorithm, the proposed algorithm increases the total throughput of V2I links by 3.76 Mbps and improves the packet delivery rate of V2V links by 17.1%, while the interference experienced by emergency service links is 1.42 dB lower than that of ordinary links. The algorithm thus guarantees priority for emergency service links and effectively improves the total transmission capacity of V2I and V2V links.

Key words: internet of vehicles, spectrum allocation, reinforcement learning, multi-agent, emergency services
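
The abstract describes a global reward shared by all agents but does not give its formula. The following is a minimal Python sketch of how such a reward could be assembled from the three objectives named above (V2I capacity, V2V packet delivery, emergency-link interference). The function names, the weights `w_v2i`, `w_v2v`, `w_emg`, and the choice of units are illustrative assumptions, not the authors' definitions.

```python
import math


def sinr(tx_power, channel_gain, interference, noise_power):
    """Linear SINR of a single link (all powers in the same linear unit)."""
    return tx_power * channel_gain / (noise_power + interference)


def capacity_bps(bandwidth_hz, sinr_linear):
    """Shannon capacity of a link in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)


def global_reward(v2i_capacity_mbps, v2v_delivered, emergency_interference,
                  w_v2i=0.1, w_v2v=1.0, w_emg=1.0):
    """Hypothetical global reward (assumed form, not taken from the paper).

    v2i_capacity_mbps      -- per-link V2I capacities in Mbps
    v2v_delivered          -- per-link booleans: was the V2V payload delivered?
    emergency_interference -- interference received by each emergency link
    The weights are placeholders that would need tuning.
    """
    v2i_term = sum(v2i_capacity_mbps)
    # Fraction of V2V links that delivered their packets (0 if there are none).
    v2v_term = sum(v2v_delivered) / len(v2v_delivered) if v2v_delivered else 0.0
    # Interference on emergency links is penalized.
    emg_term = sum(emergency_interference)
    return w_v2i * v2i_term + w_v2v * v2v_term - w_emg * emg_term


def shared_rewards(num_agents, reward):
    """Every agent receives the same global reward, which encourages cooperation."""
    return [reward] * num_agents


if __name__ == "__main__":
    # One V2I link on a 1 MHz sub-band, disturbed by one V2V transmitter.
    s = sinr(tx_power=2.0, channel_gain=0.5, interference=0.25, noise_power=0.25)
    cap_mbps = capacity_bps(1e6, s) / 1e6
    r = global_reward([cap_mbps], [True, False], [0.1])
    print(shared_rewards(num_agents=4, reward=r))
```

In a multi-agent DQN setup of the kind the abstract outlines, each agent would store this common reward in its own replay buffer alongside its local observation and its chosen (sub-band, power level) action.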

CAI Yu, GUAN Zheng, WANG Zeng-wen, WANG Xue, YANG Zhi-jun. Resource allocation algorithm for distinguished services in vehicular networks based on multi-agent deep reinforcement learning[J]. Computer Engineering & Science, 2024, 46(10): 1757-1764.

