Workshop: Machine Learning on HPC Systems (MLHPCS)

SmartPred: Unsupervised Hard Disk Failure Detection

presented by Philipp Rombach from the Institute for Machine Learning and Analytics (IMLA) at Offenburg University, Germany

Abstract

Due to rapidly increasing storage consumption worldwide, as well as the expectation of continuous availability of information, the complexity of administration in today’s data centers is steadily growing. Integrated techniques for monitoring hard disks can increase the reliability of storage systems. However, these techniques often lack the intelligent data analysis needed for predictive maintenance. To solve this problem, machine learning algorithms can be used to detect potential failures in advance and prevent them. In this paper, an unsupervised model for predicting hard disk failures based on Isolation Forest is proposed. The resulting method can deal with highly imbalanced datasets, as an experiment on the Backblaze benchmark dataset demonstrates.
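To give a rough feel for the Isolation Forest idea mentioned in the abstract, here is a minimal standard-library sketch. It is not the implementation from the paper: the two features (a reallocated-sector count and a temperature), their values, the forest parameters, and all function names are illustrative assumptions, and the data is synthetic rather than taken from Backblaze. The sketch isolates points with random axis-aligned splits and scores each drive by how quickly it gets isolated, so the few "failing" drives in a heavily imbalanced set should tend to receive higher scores without any labels being used for training.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row lands in a leaf, plus a correction for unsplit leaves."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(row, trees, sample_size):
    """Score in (0, 1]; values closer to 1 mean the row is isolated faster."""
    mean_depth = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Synthetic, highly imbalanced data (assumed values, not real SMART readings):
# each row is (reallocated sector count, temperature in degrees Celsius).
data_rng = random.Random(42)
healthy = [(data_rng.gauss(5, 2), data_rng.gauss(35, 3)) for _ in range(500)]
failing = [(data_rng.gauss(200, 20), data_rng.gauss(60, 5)) for _ in range(5)]
drives = healthy + failing

# Training is unsupervised: the healthy/failing labels are never passed in.
trees, sample_size = fit_forest(drives)
healthy_scores = [anomaly_score(d, trees, sample_size) for d in healthy]
failing_scores = [anomaly_score(d, trees, sample_size) for d in failing]

print("mean healthy score:", sum(healthy_scores) / len(healthy_scores))
print("mean failing score:", sum(failing_scores) / len(failing_scores))
```

In practice one would more likely reach for an existing library implementation and real SMART attributes; the hand-rolled version above only serves to make the isolation principle visible.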

Slides

Paper pre-print

About the Speaker

Philipp just finished his Master's degree in Enterprise and IT Security at Offenburg University. The presented work is part of his master's thesis. He is now working as a security data analyst at Sick AG.