Abstract:
This work describes a CFD algorithm for modeling compressible turbulent flows on unstructured hybrid meshes and its portable implementation for heterogeneous computing. The Navier–Stokes equations are discretized using a cell-centered high-order finite-volume method. The parallel software can run on hybrid clusters of various architectures. Parallel computations are implemented using MPI, OpenMP, CUDA, and OpenCL for the sake of comparison. The portable OpenCL-based version can engage multi-core CPUs, AMD and NVIDIA graphics processing units (GPUs), and Intel Xeon Phi many-core coprocessors. MPI communication and host-device data transfers are overlapped with computations on accelerators. Parallel efficiency and performance are studied in detail on different systems with a wide range of accelerator types. Tests were performed on up to 260 GPUs.