RUS  ENG
Full version
JOURNALS // Program Systems: Theory and Applications // Archive

Program Systems: Theory and Applications, 2024 Volume 15, Issue 1, Pages 63–94 (Mi ps440)

Hardware, software and distributed supercomputer systems

Justification of methods for accelerating iterative loops nests

E. A. Metelitsa

Southern Federal University, Rostov-on-Don, Russia

Abstract: The acceleration of iterative algorithms, found in solving problems of mathematical physics, mathematical modeling, and image processing, is considered. In the software implementation of these algorithms, there are nested loops (sections of the program that consist of nested loops). These loop nests can be accelerated by combination of optimizing transformations, including tiling, hyperplane method, and parallelization on shared memory. The equivalence of this combination of program transformations is substantiated.
A method for changing the order of tile traversal is proposed and justified. The method provides acceleration by increasing data readings from registers instead of slower memory. Considering this method, a formula for calculating optimal tile sizes is obtained.
The combination of transformations presented in this article results in an acceleration that is 1.4 times greater than the well-known optimization algorithm implemented in PLUTO. In some cases using an 8-core processor, numerical experiments show a significant increase in speed compared to the original sequential algorithms. The findings of this article can be applied to manual and automatic program optimization.

Key words and phrases: tiling, wavefront, parallelization, shared memory, iterative stencil loops.

UDC: 004.424.22, 004.312.46
BBK: 32.972.11

MSC: 68W10; 68N20

Received: 21.01.2024
Accepted: 15.02.2023

DOI: 10.25209/2079-3316-2024-15-1-63-94



© Steklov Math. Inst. of RAS, 2024