Abstract:
Using an image classification problem and the CDF 9/7 wavelet family, this paper shows how to incorporate the discrete wavelet transform into a computer vision model while keeping the model trainable by backpropagation. A convolutional wavelet block that extracts features at different decomposition levels of the incoming signal is proposed and successfully integrated into a set of neural network models. The implemented blocks reduce the original model size by $30$ to $40\%$ while maintaining comparable quality in terms of the evaluation metric. An efficient method for computing the discrete wavelet transform on a graphics processing unit using the lifting scheme is presented. The wavelet blocks are implemented with element-wise additions and multiplications, which makes it simple to export a trained model to a desired format for running on new data. The ResNetV2-50, MobileNetV2 and EfficientNetV2-B0 architectures are used as the base models. A new dataset, based on a set of categories of the LSUN dataset, is constructed for the experiments.
Keywords: neural networks; deep learning; wavelets; discrete wavelet transform; image classification
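
As a rough illustration of the lifting scheme mentioned in the abstract, the NumPy sketch below computes one level of a one-dimensional CDF 9/7 transform using only element-wise additions and multiplications on shifted copies of the signal. It is not the paper's GPU implementation. The function names `cdf97_forward` and `cdf97_inverse`, the periodic boundary handling, and the normalization convention are assumptions made for this example; the lifting coefficients are the values commonly quoted for the CDF 9/7 factorization.

```python
import numpy as np

# Lifting coefficients commonly quoted for the CDF 9/7 wavelet.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling constant; the normalization used here is one of several conventions


def cdf97_forward(x):
    """One decomposition level along the last axis (even length, periodic extension assumed)."""
    x = np.asarray(x, dtype=np.float64)
    assert x.shape[-1] % 2 == 0, "this sketch assumes an even number of samples"
    s = x[..., 0::2].copy()  # even samples, become the approximation band
    d = x[..., 1::2].copy()  # odd samples, become the detail band
    # Each step is an element-wise update using neighbouring samples.
    d += ALPHA * (s + np.roll(s, -1, axis=-1))  # predict 1
    s += BETA * (d + np.roll(d, 1, axis=-1))    # update 1
    d += GAMMA * (s + np.roll(s, -1, axis=-1))  # predict 2
    s += DELTA * (d + np.roll(d, 1, axis=-1))   # update 2
    return s * K, d / K


def cdf97_inverse(approx, detail):
    """Undo cdf97_forward by running the lifting steps in reverse order with opposite signs."""
    s = np.asarray(approx, dtype=np.float64) / K
    d = np.asarray(detail, dtype=np.float64) * K
    s = s - DELTA * (d + np.roll(d, 1, axis=-1))
    d = d - GAMMA * (s + np.roll(s, -1, axis=-1))
    s = s - BETA * (d + np.roll(d, 1, axis=-1))
    d = d - ALPHA * (s + np.roll(s, -1, axis=-1))
    x = np.empty(s.shape[:-1] + (2 * s.shape[-1],), dtype=np.float64)
    x[..., 0::2] = s
    x[..., 1::2] = d
    return x


if __name__ == "__main__":
    signal = np.random.default_rng(0).normal(size=(4, 32))
    approx, detail = cdf97_forward(signal)
    print(np.allclose(cdf97_inverse(approx, detail), signal))
```

Because every lifting step modifies one band using only the other, the transform is invertible by construction regardless of the boundary rule, and each step maps directly onto differentiable tensor operations. A two-dimensional transform for images can be obtained by applying the same steps along rows and then along columns.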