Abstract:
The KernelGen project (http://kernelgen.org/) aims to develop Fortran and C compilers
based on the state-of-art open-source technologies for automatic GPU kernels generation from unmodified CPU source code, significantly improving the code porting experiences. Parallelism detection is based on LLVM/Polly and CLooG, extended with mapping of loops onto GPU compute grid, and assisted with runtime alias analysis. PTX assembly code is generated with NVPTX backend. Thanks to integration with GCC frontend by means of DragonEgg plugin, and customized linker, KernelGen features full GCC compatibility, and is able to compile complex applications into hybrid binaries containing both CPU and GPU-enabled executables. In addition to more robust parallelism detection, test kernels produced by KernelGen are up to 60 % faster than generated by PGI compiler for kernels source with manually inserted OpenACC directives.