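• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)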

• High Performance Computing

Analysis of cloud server fault data based on improved FP-Growth algorithm

HE Wang (何望)1,2, LIN Guo-yuan (林果园)1,2

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221000, Jiangsu, China
  2. Engineering Research Center of Mine Digitization, Ministry of Education, Xuzhou 221000, Jiangsu, China

  • Received: 2019-07-18 Revised: 2020-01-03 Online: 2020-05-25 Published: 2020-05-25
  • Supported by:

    Fundamental Research Funds for the Central Universities (2017XKQY079)

Abstract:

To address abnormal parameters that arise while cloud servers are in use, this paper describes how cloud server parameter data are acquired, cleaned and organized, and analyzed. The existing frequent pattern growth (FP-Growth) algorithm builds conditional FP-trees with considerable redundancy, and its processing efficiency falls as the data volume grows. An improved FP-Growth algorithm is therefore proposed: it introduces an array tagging strategy, and each FP-tree node retains only a pointer to its parent node. The improved algorithm does not need to generate conditional FP-trees during mining, which reduces time and space consumption. Experimental results show that the improved parallel FP-Growth algorithm effectively improves the efficiency of association analysis of abnormal virtual machine data on the cloud platform, and that it is also suitable for mining larger-scale data sets.

Key words: cloud server, fault analysis, FP-Growth algorithm, data mining
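
The abstract names two ideas: FP-tree nodes that keep only a parent pointer, and mining without building conditional FP-trees. The sketch below shows one plausible way these can fit together; it is not the authors' implementation. The abstract does not specify the array tagging strategy, so the flat `items`/`parents`/`counts` arrays, the temporary `children` lookup used only while building, the function names, and the fault labels in `EXAMPLE_LOG` are all assumptions made for illustration. The sketch is sequential and does not attempt the parallel version. Instead of constructing a conditional tree for each suffix, it walks parent pointers upward from the nodes that currently represent the suffix and accumulates support per ancestor item.

```python
from collections import defaultdict


def build_tree(transactions, min_support):
    """Build an FP-tree stored in flat arrays; nodes keep only a parent index."""
    freq = defaultdict(int)
    for t in transactions:
        for item in set(t):
            freq[item] += 1
    frequent = sorted((i for i in freq if freq[i] >= min_support),
                      key=lambda i: (-freq[i], i))
    rank = {item: r for r, item in enumerate(frequent)}

    items, parents, counts = [None], [-1], [0]  # index 0 is the root
    children = {}               # (parent index, item) -> node index; build-time only
    header = defaultdict(list)  # item -> indices of the nodes carrying that item
    for t in transactions:
        node = 0
        for item in sorted((i for i in set(t) if i in rank), key=rank.__getitem__):
            child = children.get((node, item))
            if child is None:
                child = len(items)
                items.append(item)
                parents.append(node)
                counts.append(0)
                children[(node, item)] = child
                header[item].append(child)
            counts[child] += 1
            node = child
    return items, parents, counts, header


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} without building conditional trees."""
    items, parents, counts, header = build_tree(transactions, min_support)
    result = {}

    def grow(suffix, entries):
        # entries maps a tree node to the number of transactions that contain
        # the current suffix and whose remaining prefix path ends above that node.
        by_item = defaultdict(lambda: defaultdict(int))
        for node, weight in entries.items():
            anc = parents[node]
            while anc > 0:  # climb towards the root via parent indices only
                by_item[items[anc]][anc] += weight
                anc = parents[anc]
        for item, nodes in by_item.items():
            support = sum(nodes.values())
            if support >= min_support:
                pattern = suffix | {item}
                result[pattern] = support
                grow(pattern, nodes)

    for item, node_list in header.items():
        entries = {n: counts[n] for n in node_list}
        single = frozenset([item])
        result[single] = sum(entries.values())
        grow(single, entries)
    return result


# Hypothetical fault records: each row lists the abnormal parameters seen together.
EXAMPLE_LOG = [
    ["cpu_high", "mem_high"],
    ["cpu_high", "disk_io", "net_drop"],
    ["mem_high", "disk_io", "net_drop"],
    ["cpu_high", "mem_high", "disk_io"],
    ["cpu_high", "mem_high", "disk_io", "net_drop"],
]

if __name__ == "__main__":
    found = mine_frequent_itemsets(EXAMPLE_LOG, min_support=3)
    for itemset, support in sorted(found.items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), support)
```

In this sketch the only per-suffix state is a small map from tree nodes to weights, so the original tree is reused at every recursion level. Whether this matches the savings reported in the paper depends on details the abstract does not give.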