
Num. Meth. Prog., 2021 Volume 22, Issue 3, Pages 230–238 (Mi vmp1036)

Parallel software tools and technologies

Preprocessing of system monitoring data for workload analysis of HPC systems

M. I. Martyshov, D. A. Nikitenko

Lomonosov Moscow State University

Abstract: HPC systems have complex architectures and contain millions of components. To ensure reliable operation and efficient output, the functioning of most subsystems should be supervised. This supervision relies on data collected from various logging and monitoring systems. Because the data come from different sources, their analysis can run into multiple processing issues. Some data subsets can be incorrect due to malfunctioning sensors, data aggregation errors in the monitoring system, and similar causes. It is therefore crucial to preprocess such monitoring data before analyzing it, taking the analysis goals into consideration. Based on monitoring data from the MSU HPC Center, this paper proposes an approach to preprocessing data from HPC monitoring systems, gives real-life examples of issues that may be encountered, and offers recommendations for further analysis of similar datasets.

Keywords: supercomputing, supercomputers, system monitoring data analysis, system monitoring data cleaning, system monitoring data reduction.

Received: 22.08.2021

DOI: 10.26089/NumMet.v22r314




