A cloud computing virtual machine scheduling strategy based on fuzzy reinforcement learning

Computer Engineering & Science, 2025, Vol. 47, Issue (01): 56-65.

YU Shirui1, JIANG Chunmao2

(1. College of Computer Science and Information Engineering, Harbin Normal University, Harbin 150025;
2. College of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, China)

Received: 2023-07-13 Revised: 2024-02-25 Accepted: 2025-01-25 Online: 2025-01-25 Published: 2025-01-18
Funding:
Natural Science Foundation of Heilongjiang Province (LH2020F031)

Abstract

Abstract: To address the high energy consumption caused by inefficient resource management in cloud computing data centers, a fuzzy-based Q-learning(λ) reinforcement learning algorithm is proposed that reduces energy overhead by solving the virtual machine placement (VMP) problem. The number of virtual machines in the current state and the utilization of the physical machines are fed as input states into a fuzzy controller, which is combined with a reinforcement learning (RL) algorithm to execute the corresponding policy. The algorithm dynamically allocates virtual machines to physical servers, reduces the number of virtual machine migrations, and optimizes resource utilization, lowering energy consumption while satisfying user service level agreements (SLAs). It copes with fluctuating workloads and provides suitable VM deployment (initial placement or remapping) while meeting the quality of service (QoS) expected under the SLA. Experimental results show that, compared with the Q-learning, Q-learning(λ), Greedy, and PSO placement algorithms, the fuzzy-based Q-learning(λ) algorithm significantly reduces energy consumption, converges faster, and has a certain practical value.

Key words: cloud computing, virtual machine placement, reinforcement learning, fuzzy system
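The abstract gives no membership functions, reward definition, or action set, so the following is only a minimal sketch of how a fuzzified state could drive a tabular Q-learning(λ) update. It assumes three triangular fuzzy sets per input, a state made of the fuzzified VM count and host utilization, and Watkins's variant of Q(λ) with accumulating traces. The names `FUZZY_SETS`, `encode_state`, and `FuzzyQLambdaAgent`, the action labels, and the reward value are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

# Assumed triangular fuzzy sets (left foot, peak, right foot) over [0, 1].
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def triangular(x, a, b, c):
    """Membership degree of x in a triangle with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzify(x):
    """Return the label of the fuzzy set in which x has the highest membership."""
    return max(FUZZY_SETS, key=lambda label: triangular(x, *FUZZY_SETS[label]))


def encode_state(vm_count, max_vms, host_utilization):
    """Map the two inputs named in the abstract to a discrete fuzzy state."""
    vm_ratio = min(max(vm_count / max_vms, 0.0), 1.0)
    utilization = min(max(host_utilization, 0.0), 1.0)
    return (fuzzify(vm_ratio), fuzzify(utilization))


class FuzzyQLambdaAgent:
    """Tabular Watkins's Q(lambda) over fuzzy states (illustrative only)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.lam, self.epsilon = alpha, gamma, lam, epsilon
        self.q = defaultdict(float)      # Q-value per (state, action)
        self.trace = defaultdict(float)  # eligibility trace per (state, action)
        self.rng = random.Random(seed)

    def greedy(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy(state)

    def update(self, state, action, reward, next_state, next_action):
        best_next = self.greedy(next_state)
        delta = (reward + self.gamma * self.q[(next_state, best_next)]
                 - self.q[(state, action)])
        self.trace[(state, action)] += 1.0  # accumulating trace
        # Watkins's rule: traces survive only if the next action is greedy.
        is_greedy = self.q[(next_state, next_action)] == self.q[(next_state, best_next)]
        for key in list(self.trace):
            self.q[key] += self.alpha * delta * self.trace[key]
            self.trace[key] = self.gamma * self.lam * self.trace[key] if is_greedy else 0.0


# Hypothetical usage: two candidate hosts, reward = negative energy cost.
agent = FuzzyQLambdaAgent(actions=["host_0", "host_1"])
state = encode_state(vm_count=3, max_vms=10, host_utilization=0.2)
action = agent.choose(state)
next_state = encode_state(vm_count=4, max_vms=10, host_utilization=0.35)
agent.update(state, action, reward=-1.0, next_state=next_state,
             next_action=agent.choose(next_state))
print(agent.q[(state, action)])
```

The eligibility traces are what separate Q-learning(λ) from plain Q-learning in this sketch: a reward observed after a later placement also adjusts the values of earlier state-action pairs, which is one plausible reason for the faster convergence the abstract reports.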

YU Shirui, JIANG Chunmao. A cloud computing virtual machine scheduling strategy based on fuzzy reinforcement learning[J]. Computer Engineering & Science, 2025, 47(01): 56-65.
