
Num. Meth. Prog., 2023 Volume 24, Issue 1, Pages 1–9 (Mi vmp1070)

Parallel software tools and technologies

Bottlenecks in organizing the workflows of large HPC centers

Dmitry A. Nikitenko

Lomonosov Moscow State University, Research Computing Center, Moscow, Russia

Abstract: The effective output of data centers is determined by many complementary factors. Attention is often paid to only a few of them, those that seem most significant at first glance, such as the efficiency of the scheduler or the efficiency of resource utilization by user tasks. As a result, a more general view of the problem is often missed: the level at which the interconnection of work processes in the HPC center is determined and the effective work of the center as a whole is organized. Omissions at this level can negate any subtle low-level optimizations. This paper provides a scheme for describing workflows in a supercomputer center and analyzes the experience of large HPC facilities in identifying bottlenecks in this chain. A possible software implementation is also proposed in the form of a support system for the functioning of the HPC site, which makes it possible to optimize the organization of work at all stages.

Keywords: supercomputing, provision of computing resources, use of computing resources, workflows at a supercomputer center, shared research facilities, provision of computing services.

Received: 30.12.2022
Accepted: 09.01.2023

Language: English

DOI: 10.26089/NumMet.v24r101




