Abstract:
Matrix inversion is widely used in numerical methods such as linear solvers, preconditioning of linear systems, domain decomposition, and digital image processing. High-performance implementation of matrix inversion requires efficient matrix storage formats and an optimal distribution of computations among computing devices. In this paper, we study the performance of traditional matrix inversion algorithms, such as LU factorization and Gauss-Jordan elimination, as well as the conjugate gradient method and the Sherman-Morrison formula. In the last two algorithms, matrix-vector products and scalar products are executed efficiently on multicore/manycore processors. We compare the performance of these algorithms on hybrid multi-CPU, multi-GPU platforms, using matrices from well-known test suites and from the numerical simulation of a wrap spring.