Abstract:
A fast algorithm for computing 2D convolutions based on Nussbaumer polynomial transforms is considered. An efficient implementation using Intel AVX SIMD instructions is proposed. It is shown that, for a limited range of convolution kernels, performance increases by 50% compared with the direct algorithm and with the fast convolution method based on the fast Fourier transform implemented in the Intel IPP library.
Keywords: 2D convolution, polynomial transform, fast algorithms.