
Num. Meth. Prog., 2013 Volume 14, Issue 2, Pages 11–17 (Mi vmp146)

Programming

Automating the location of errors and inefficiencies in parallel programs

A. S. Antonov, Vad. V. Voevodin, S. A. Zhumatii, D. A. Nikitenko, K. S. Stefanov, P. A. Shvets

M.V. Lomonosov Moscow State University, Research Computing Center

Abstract: Efficient utilization of available computational resources becomes increasingly important as supercomputer applications scale rapidly. Excessive computation caused by inefficient algorithm implementations, an unreasonably large number of test runs, and failure to take the peculiarities of the software and system architecture into account, together with many other factors, leads to wasted computational resources, longer development time, and a higher cost of obtaining results. There are various ways to automate efficiency analysis and the location of errors in parallel applications. This paper proposes an integrated approach to studying the efficiency of application runs. This work was supported by the Ministry of Education and Science of the Russian Federation (contract 14.514.11.4062).

Keywords: supercomputer; performance; efficiency study; parallel computing; parallel programs; dynamic program characteristics; high performance computing; profiling; monitoring; supercomputer center.

UDC: 004.021

Received: 19.03.2013


