Abstract:
The article considers methods for reducing the complexity of approximating models. A probabilistic justification of distillation and of learning with privileged information is proposed. General conclusions are derived for an arbitrary parametric function with a predetermined structure. The theoretical justification is worked out for the special cases of linear and logistic regression. The considered models are analyzed in computational experiments on synthetic samples and on real data, namely the FashionMNIST and Twitter Sentiment Analysis datasets.
Keywords: model selection, Bayesian inference, model distillation, learning with privileged information.