Abstract:
Parallel processing of unstructured data given in a graph-like form can be a severe computational challenge because of the significant overheads caused by the irregular nature of graph algorithms and by the hardware latency of intensive data access. This paper considers a GPU implementation of a load balancing method that dramatically improves the performance of the parallel breadth-first search algorithm compared to its sequential counterpart on a CPU. This work was partially supported by the Russian Foundation for Basic Research (project 14-07-00435) and by UB RAS (projects 12-P-1-1029 and RCP-13-P18). Numerical experiments were performed on the “Uran” supercomputer installed at IMM UB RAS. This paper was recommended for publication by the Program Committee of the International Scientific Conference “Scientific service in the Internet: all aspects of parallelism” (http://agora.guru.ru/abrau2013).