Abstract:
This paper investigates the problem of optimizing deep learning models. The authors propose a method for controlling model complexity, where complexity is interpreted as the minimum description length: the minimal amount of information required to transmit the model and the dataset. The proposed method is based on representing the deep learning model in the form of a hypernet constructed using Bayesian inference. A hypernet is a model that generates the parameters of an optimal model. The authors introduce probabilistic assumptions about the distribution of the parameters of the deep learning model and propose maximizing the evidence lower bound of the Bayesian model. The evidence lower bound is treated as a conditional quantity that depends on the required model complexity. The method is analyzed in computational experiments on the MNIST dataset.
Keywords: model variational optimization, hypernets, deep learning, neural networks, Bayesian inference, model complexity control.