A. V. Zakirov, V. D. Levchenko, A. Yu. Perepelkina, Yasunari Zempo, “High performance FDTD code implementation for GPGPU supercomputers”, Keldysh Institute preprints, 2016, 044, 22 pp.

This article is cited in 5 papers

High performance FDTD code implementation for GPGPU supercomputers

A. V. Zakirov, V. D. Levchenko, A. Yu. Perepelkina, Yasunari Zempo

Abstract: An implementation of the FDTD (Finite Difference Time Domain) method for solving optical and other electrodynamic problems of high computational cost is described. The implementation is based on the LRnLA (Locally Recursive non-Locally Asynchronous) algorithm DiamondTorre, which was developed specifically for GPGPU (General-Purpose Graphics Processing Unit) hardware. The specifics of the DiamondTorre algorithms for a staggered grid (Yee cell) and for many-GPU devices are shown. The algorithm is implemented in software for real physics calculations using CUDA, OpenMP, and MPI. The performance limits of the software are estimated from the algorithm parameters and a computer model of TSUBAME2.5. The actual performance is measured on a single GPU device and, with strong and weak scaling tests, on a many-GPU cluster. A performance of up to $0.65\cdot10^{12}$ cell updates per second is achieved for a 3D domain with a total of $0.3\cdot10^{12}$ Yee cells.
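
For readers unfamiliar with the method, the sketch below illustrates the plain leapfrog FDTD update on a staggered (Yee) grid in one dimension, in normalized units. It is not the DiamondTorre algorithm and is not taken from the preprint: the function names `fdtd_1d` and `seconds_per_step`, the Gaussian initial pulse, the perfectly conducting walls, and the Courant number 0.5 are all assumptions chosen for illustration. The second helper only restates the abstract's figures as arithmetic, under the further assumption that the peak update rate applies to the full domain.

```python
import math


def fdtd_1d(n_cells=200, n_steps=100, courant=0.5):
    """Naive leapfrog FDTD on a 1D staggered (Yee) grid, normalized units.

    Illustrative only: a straightforward sweep over the whole grid per
    time step, not the DiamondTorre traversal described in the preprint.
    """
    # E lives on integer nodes; the two end nodes are held at zero
    # (perfectly conducting walls, an assumption of this sketch).
    ez = [0.0] * (n_cells + 1)
    # H lives on half-integer nodes, between neighbouring E nodes.
    hy = [0.0] * n_cells

    # Hypothetical initial condition: Gaussian pulse in E, zero H.
    center = n_cells // 2
    for i in range(1, n_cells):
        ez[i] = math.exp(-((i - center) / 8.0) ** 2)

    for _ in range(n_steps):
        # Half step: update H from the spatial difference of E.
        for i in range(n_cells):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Half step: update interior E from the spatial difference of H.
        for i in range(1, n_cells):
            ez[i] += courant * (hy[i] - hy[i - 1])
    return ez, hy


def seconds_per_step(total_cells, updates_per_second):
    """Wall time for one full-domain time step at a given update rate."""
    return total_cells / updates_per_second


if __name__ == "__main__":
    ez, hy = fdtd_1d()
    print("max |Ez| after run:", max(abs(v) for v in ez))
    # Figures from the abstract: 0.3e12 cells, 0.65e12 updates per second.
    print("seconds per step:", seconds_per_step(0.3e12, 0.65e12))
```

Each time step touches every cell once, so the "cell updates per second" metric quoted in the abstract is the number of cells multiplied by the number of time steps, divided by the wall time. A naive sweep like this one rereads the whole grid from memory on every step, which is the kind of cost that locality-oriented algorithms such as DiamondTorre aim to reduce.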

UDC: 519.688

Language: English

DOI: 10.20948/prepr-2016-44-e