Abstract:
In this paper, we consider the construction of efficient finite element algorithms on three-dimensional unstructured meshes that take into account the complexity of parallel synchronization, memory distribution, and data storage. We propose a layer-by-layer partitioning of the mesh into subdomains without branching internal boundaries, which simplifies access to independent data and parallel computation at the different stages of solving a finite element problem on unstructured meshes in multiply connected domains. We analyze how well the time efficiency and resource intensity of the proposed algorithmic solutions can be predicted. The resource efficiency of the algorithms is analyzed for the element-by-element scheme for forming and solving the system of linear algebraic equations of the finite element method. It is shown that, because of the low arithmetic intensity of the algorithms considered, their performance is limited by the bandwidth of the memory subsystem rather than by processor performance. Graphics memory has a higher bandwidth than system random-access memory, which allows a significant increase in the performance of the algorithms on a GPU.
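
The bandwidth argument can be illustrated with a simple roofline-style estimate, in which attainable performance is bounded by the smaller of the peak arithmetic rate and the product of memory bandwidth and arithmetic intensity. The following is a minimal sketch under stated assumptions, not the analysis carried out in the paper: the function names are hypothetical, and the hardware figures and the arithmetic intensity are illustrative placeholders rather than measured or reported values.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Roofline bound: min(peak compute rate, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def is_memory_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """True when the intensity lies below the ridge point peak / bandwidth."""
    return intensity_flop_per_byte < peak_gflops / bandwidth_gbs


# ASSUMPTION: illustrative placeholder values, not taken from the paper.
INTENSITY = 0.25  # flop per byte, a low arithmetic intensity
CPU = {"peak_gflops": 500.0, "bandwidth_gbs": 50.0}    # hypothetical CPU + RAM
GPU = {"peak_gflops": 5000.0, "bandwidth_gbs": 500.0}  # hypothetical GPU + graphics memory

cpu_bound = attainable_gflops(CPU["peak_gflops"], CPU["bandwidth_gbs"], INTENSITY)
gpu_bound = attainable_gflops(GPU["peak_gflops"], GPU["bandwidth_gbs"], INTENSITY)

print("CPU bound (GFLOP/s):", cpu_bound)
print("GPU bound (GFLOP/s):", gpu_bound)
print("Estimated speedup:", gpu_bound / cpu_bound)
```

With these placeholder values both devices are memory bound, so the estimated speedup equals the ratio of the two memory bandwidths and does not depend on the peak arithmetic rates.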