Num. Meth. Prog., 2023 Volume 24, Issue 4, Pages 339–351 (Mi vmp1093)

Parallel software tools and technologies

Shared memory based MPI Reduce and Bcast algorithms

A. A. Romanyuta (a), M. G. Kurnosov (b, c)

a Siberian State University of Telecommunications and Informatics, Novosibirsk
b Siberian Academy of Telecommunications and Informatics
c Rzhanov Institute of Semiconductor Physics, Siberian Branch of Russian Academy of Sciences, Novosibirsk

Abstract: Algorithms are proposed that implement the collective operations MPI_Bcast, MPI_Reduce, and MPI_Allreduce over the shared memory of multiprocessor servers. The algorithms create a shared memory segment and, within it, a system of queues through which message blocks are transmitted. The software implementation is built into the Open MPI library as an isolated coll/sharm component. Unlike existing algorithms, interaction with the queue system relies on spinlocks and is designed to reduce the number of barrier synchronizations and atomic operations. In experiments on an x86-64 server, the largest reduction in execution time relative to the coll/tuned component of the Open MPI library was a factor of 6.5 (85% less) for MPI_Bcast and a factor of 3.3 (70% less) for MPI_Reduce. Recommendations are given on using the algorithms for different message sizes.
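
The idea of passing message blocks through a queue in shared memory can be illustrated with a minimal sketch in C. This is not the coll/sharm implementation: the abstract does not give the queue layout, so every name here (slot_t, queue_t, queue_try_push, queue_try_pop, bcast_demo) and both constants are hypothetical. The sketch assumes a single-writer, single-reader ring in which each slot carries one flag, so that publishing or releasing a block costs one atomic store and no barrier. For simplicity the ring lives in ordinary memory and one process plays both the root and the receiver. In a real multi-process setting the ring would sit in the shared segment and each side would spin on the try functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8   /* bytes per message block (tiny, for illustration) */
#define QUEUE_SLOTS 4  /* number of slots in the ring (assumed value) */

/* One slot: a block of payload plus a flag set by the writer and
 * cleared by the reader. */
typedef struct {
    atomic_int full;                 /* 0 = free, 1 = holds a block */
    size_t len;                      /* valid bytes in data */
    unsigned char data[BLOCK_SIZE];
} slot_t;

/* Single-writer, single-reader ring. head is touched only by the writer
 * and tail only by the reader, so neither index needs to be atomic. */
typedef struct {
    slot_t slots[QUEUE_SLOTS];
    size_t head;
    size_t tail;
} queue_t;

static void queue_init(queue_t *q) {
    for (size_t i = 0; i < QUEUE_SLOTS; i++) {
        atomic_init(&q->slots[i].full, 0);
        q->slots[i].len = 0;
    }
    q->head = 0;
    q->tail = 0;
}

/* Writer side: returns 1 if the block was enqueued, 0 if the ring is full.
 * A spin-waiting caller would simply retry until it returns 1. */
static int queue_try_push(queue_t *q, const void *buf, size_t len) {
    assert(len <= BLOCK_SIZE);
    slot_t *s = &q->slots[q->head];
    if (atomic_load_explicit(&s->full, memory_order_acquire))
        return 0;
    memcpy(s->data, buf, len);
    s->len = len;
    /* Release store publishes the payload written above. */
    atomic_store_explicit(&s->full, 1, memory_order_release);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    return 1;
}

/* Reader side: returns 1 and copies one block out, or 0 if none is ready. */
static int queue_try_pop(queue_t *q, void *buf, size_t *len) {
    slot_t *s = &q->slots[q->tail];
    if (!atomic_load_explicit(&s->full, memory_order_acquire))
        return 0;
    memcpy(buf, s->data, s->len);
    *len = s->len;
    /* Release store hands the slot back to the writer. */
    atomic_store_explicit(&s->full, 0, memory_order_release);
    q->tail = (q->tail + 1) % QUEUE_SLOTS;
    return 1;
}

/* Single-process stand-in for a root and one receiver: moves n bytes from
 * msg to out block by block and returns the number of blocks transferred. */
static size_t bcast_demo(const unsigned char *msg, size_t n,
                         unsigned char *out) {
    queue_t q;
    queue_init(&q);
    size_t sent = 0, recvd = 0, blocks = 0;
    while (recvd < n) {
        /* Root: enqueue blocks until the message ends or the ring fills. */
        while (sent < n) {
            size_t len = (n - sent < BLOCK_SIZE) ? n - sent : BLOCK_SIZE;
            if (!queue_try_push(&q, msg + sent, len))
                break;
            sent += len;
        }
        /* Receiver: drain every block that is currently available. */
        size_t len;
        while (queue_try_pop(&q, out + recvd, &len)) {
            recvd += len;
            blocks++;
        }
    }
    return blocks;
}
```

Calling bcast_demo on a 21-byte message with these constants moves three blocks (8, 8, and 5 bytes). Splitting a message into blocks this way lets the receiver start copying before the root has finished writing, which is the usual motivation for pipelining through a queue.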

Keywords: Bcast; Reduce; Allreduce; collective operations; MPI; computer systems.

UDC: 004.724.3

Received: 24.07.2023

DOI: 10.26089/NumMet.v24r424


