Abstract:
When processing high-dimensional matrices with an irregular structure, the real performance of cluster multiprocessor computing systems (MCS) is low and, even with special processing methods, does not exceed 30% of the peak. Reconfigurable computing systems (RCS) can process large matrices with an irregular structure efficiently: the authors previously proposed a method for processing large sparse unstructured (LSU) matrices on an RCS that achieves real performance close to 50% of the peak. This article describes a modification of that method, characterized by parallel processing of the non-zero elements of a row, which makes it possible to double the speed of the computing structure with only a slight increase in the occupied hardware resources. The modified method of processing LSU matrices on an RCS provides real performance close to 90% of the peak, which significantly exceeds the known results for similar problems on cluster MCS. The theoretical conclusions are confirmed by two comparisons: ranking web pages with the PageRank algorithm on the “Arcturus” RCS and the Fugaku supercomputer, and solving a system of linear algebraic equations (SLAE) with the Jacobi method on the “Arcturus” RCS and the NVIDIA Tesla K40 graphics accelerator.
Keywords: reconfigurable computing systems, high-performance computing systems, sparse matrix, large unstructured matrix, sparse matrix format, discrete-event transformation, intensity balance dataflow, parallelization of calculations, parallelization over non-zero elements.