• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (12): 2200-2207.

• Papers •

Comparison of machine learning methods for disk failure prediction

DONG Yong, JIANG Yanhuang, LU Yutong, ZHOU Enqiang

  1. (1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; 2. State Key Laboratory of High Performance Computing, Changsha, Hunan 410073, China)
  • Received: 2015-08-03 Revised: 2015-10-16 Online: 2015-12-25 Published: 2015-12-25
  • Supported by:

    National High Technology Research and Development Program of China (863 Program) (2012AA01A301); National Natural Science Foundation of China (61272141, 61303068, 61120106005)

Abstract:

Disks are among the most important data storage devices, so improving disk reliability and data availability is of great significance. Modern disks generally support the SMART protocol, which monitors a disk's internal operating status. We apply machine learning methods to the SMART data of disks in order to predict disk failures. The methods are backpropagation neural networks, decision trees, support vector machines, and naive Bayes. Real SMART data from disks are used to validate and analyze the methods, and their effectiveness is compared on these data. The results show that the decision tree method achieves the best prediction rate, while the support vector machine method has the lowest false alarm rate.

Key words: disk; failure prediction; machine learning
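
The abstract compares the methods on two quantities, prediction rate and false alarm rate, without giving their formulas. The sketch below assumes the conventional definitions (prediction rate as the fraction of failed disks that were flagged, false alarm rate as the fraction of healthy disks that were flagged); the function name `evaluate`, the `Rates` container, and the sample labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Rates:
    prediction_rate: float   # assumed: flagged failed disks / all failed disks
    false_alarm_rate: float  # assumed: flagged healthy disks / all healthy disks


def evaluate(actual: Sequence[bool], predicted: Sequence[bool]) -> Rates:
    """Compare true labels with predictions; True means 'disk fails'."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # failures caught
    fn = sum(1 for a, p in pairs if a and not p)      # failures missed
    fp = sum(1 for a, p in pairs if not a and p)      # healthy disks flagged
    tn = sum(1 for a, p in pairs if not a and not p)  # healthy disks passed

    failed, healthy = tp + fn, fp + tn
    return Rates(
        prediction_rate=tp / failed if failed else 0.0,
        false_alarm_rate=fp / healthy if healthy else 0.0,
    )


# Made-up labels for six disks: two failed, four healthy.
actual = [True, True, False, False, False, False]
predicted = [True, False, True, False, False, False]
rates = evaluate(actual, predicted)
print(rates.prediction_rate, rates.false_alarm_rate)
```

Under these assumed definitions the two rates are computed over different populations (failed disks and healthy disks), which is why one method can lead on prediction rate while another has the lowest false alarm rate.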