
St. Petersburg Polytechnical University Journal. Computer Science. Telecommunication and Control Sys, 2019 Volume 12, Issue 4, Pages 145–158 (Mi ntitu256)

System Analysis and Control

A survey of the approaches to storage systems fault detection

M. B. Uspenskiy

Peter the Great St. Petersburg Polytechnic University

Abstract: In this paper, we present a comparative analysis of existing software for health monitoring in enterprise-level storage systems and describe commonly used approaches to collecting, processing, and storing monitoring data, as well as fault detection methods. Based on this analysis, we propose criteria for classifying and comparing monitoring software and a generalized monitoring software architecture, including its modules and their interaction. We also survey recent publications on anomaly detection and fault diagnosis in the field of data storage and computing systems, and describe commonly used algorithms, including clustering and classification methods, statistical analysis, SVM, isolation forest, artificial immune systems, and invariant networks.

Keywords: anomaly detection, machine learning, fault diagnosis, storage system, root cause analysis.

UDC: 004.021

Received: 10.05.2019

DOI: 10.18721/JCSTCS.12412


