K. Isupov, V. S. Knyazkov, “Multiple-precision matrix-vector multiplication on graphics processing units”, Program Systems: Theory and Applications, 2020, Volume 11, Issue 3, Pages 33

This article is cited in 2 papers

Hardware and Software for Supercomputers

Multiple-precision matrix-vector multiplication on graphics processing units

K. Isupov^a, V. S. Knyazkov^b

^a Vyatka State University
^b Penza State University

Abstract: We consider a parallel implementation of matrix-vector multiplication (GEMV, Level 2 of the BLAS) for graphics processing units (GPUs) using multiple-precision arithmetic based on the residue number system. In our GEMV implementation, element-wise operations on multiple-precision vectors and matrices are split into several parts, each computed by a separate CUDA kernel. This design eliminates branch divergence in the sequential parts of multiple-precision operations and allows full utilization of the GPU’s resources. An efficient data structure for storing arrays with multiple-precision entries provides a coalesced access pattern to GPU global memory. We perform a rounding error analysis and derive error bounds for the proposed GEMV implementation. Experimental results show the high efficiency of the proposed solution compared to existing high-precision packages deployed on GPUs.
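
As a rough illustration of why residue number system (RNS) arithmetic suits parallel hardware, the sketch below computes an integer matrix-vector product with every addition and multiplication done independently per modulus, with no carries between residues. It is not the authors' implementation: it is plain Python rather than CUDA, it handles non-negative integers only (no exponent, rounding, or error bounds), and the moduli set and all function names are hypothetical choices made for this example.

```python
from math import prod

# Hypothetical moduli set, chosen only for illustration; pairwise coprime.
MODULI = (251, 253, 255, 256)
M = prod(MODULI)  # results are exact only while they stay below M


def to_rns(x):
    """Represent a non-negative integer by its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Add two RNS numbers; each residue is processed independently."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Multiply two RNS numbers; each residue is processed independently."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))


def from_rns(r):
    """Recover the integer with the Chinese remainder theorem."""
    total = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        total += ri * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M


def rns_gemv(A, x):
    """Compute y = A * x with all arithmetic carried out on residues."""
    xr = [to_rns(v) for v in x]
    y = []
    for row in A:
        acc = to_rns(0)
        for a, v in zip(row, xr):
            acc = rns_add(acc, rns_mul(to_rns(a), v))
        y.append(from_rns(acc))
    return y
```

For instance, `rns_gemv([[1, 2], [3, 4]], [5, 6])` gives the same result as ordinary integer arithmetic as long as each entry of the product stays below `M`. In a GPU setting the per-residue loops are the natural candidates for parallel execution, which is the property the abstract relies on.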

Key words and phrases: multiple-precision computations, BLAS, GEMV, parallel algorithms, CUDA, GPU, residue number system.

UDC: 004.222+004.272.25
BBK: З973:З972.1

Received: 29.04.2020
24.07.2020

DOI: 10.25209/2079-3316-2020-11-3-33-59