Convergence of a multilayer perceptron to histogram-based Bayesian regression
N. A. Eliseev^a,
A. I. Perminov^a,
D. Yu. Turdakov^{a,b}
^a Ivannikov Institute for System Programming of the Russian Academy of Sciences
^b Research Center for Trusted Artificial Intelligence, ISP RAS
Abstract:
The paper considers the problem of improving the interpretability and theoretical justification of decisions made by a Bayesian classifier when empirical data are approximated by a multilayer perceptron. Histogram-based regression preserves transparency and a clear statistical interpretation, but is limited by its memory requirements ($O(n)$) and low scalability, whereas a multilayer perceptron provides a memory-efficient representation ($O(1)$) and high computational efficiency at the cost of limited interpretability.
Special attention is paid to a unary training scheme, in which the training set consists of examples from a single target class and additional background points uniformly distributed over a compact subset of the feature space. This approach makes it possible to process each class in isolation and to implement a rejection rule outside the support of the data distribution, thereby increasing the reliability of the model.
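The unary training scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `make_unary_training_set` is hypothetical, and an axis-aligned box is assumed as the compact subset from which background points are drawn.

```python
import random


def make_unary_training_set(target_points, n_background, bounds, seed=0):
    """Build a unary training set.

    Target-class points get label 1. Background points get label 0 and are
    drawn uniformly from an axis-aligned box given by `bounds`, a list of
    (low, high) pairs, one per feature. The box is an assumption standing in
    for the compact subset of the feature space.
    """
    rng = random.Random(seed)
    background = [
        tuple(rng.uniform(low, high) for low, high in bounds)
        for _ in range(n_background)
    ]
    xs = [tuple(p) for p in target_points] + background
    ys = [1] * len(target_points) + [0] * n_background
    return xs, ys


# Usage: two target points in the unit square plus five background points.
xs, ys = make_unary_training_set(
    [(0.1, 0.2), (0.3, 0.4)], n_background=5, bounds=[(0.0, 1.0), (0.0, 1.0)]
)
```

A regressor fitted to such a set is pulled towards 0 wherever only background points occur, which is what allows a low output to be read as a rejection outside the support of the target class.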
It is proposed to interpret the output of the perceptron as a consistent analogue of a histogram estimator whose partition is induced by the linearity cells of the perceptron. It is shown that, under natural regularity conditions and controlled growth of the architecture, the output function of a multilayer perceptron is consistent and asymptotically equivalent to the histogram estimator. Consistency is rigorously proved for the case of a fixed first layer, while numerical experiments confirm the applicability of the results to models in which all layers are trainable.
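To make the histogram interpretation concrete, here is a minimal sketch of a histogram estimator whose cells are defined by random hyperplanes. It assumes a single hidden layer with fixed weights and a piecewise linear activation with a kink at zero (such as ReLU), so that a cell is identified by the sign pattern of the pre-activations. All function names, the Gaussian weight distribution, the bias range, and the default value for empty cells are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict


def random_hyperplanes(n_units, dim, seed=0):
    """Draw fixed first-layer parameters (w, b); distributions are assumed."""
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(-1.0, 1.0))
        for _ in range(n_units)
    ]


def cell_id(x, hyperplanes):
    """Sign pattern of w.x + b over all units; it identifies the cell of x."""
    return tuple(
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0.0
        for w, b in hyperplanes
    )


def fit_cell_histogram(xs, ys, hyperplanes):
    """Histogram regression: average the labels of the points in each cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in zip(xs, ys):
        cell = cell_id(x, hyperplanes)
        sums[cell] += y
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in counts}


def predict(x, table, hyperplanes, default=0.0):
    """Return the cell average; empty cells fall back to `default` (assumed)."""
    return table.get(cell_id(x, hyperplanes), default)


# Usage: one hand-picked hyperplane splits the line at x = 0.5.
planes = [([1.0], -0.5)]
table = fit_cell_histogram([(0.1,), (0.2,), (0.8,), (0.9,)], [0, 1, 1, 1], planes)
```

In this sketch the lookup table grows with the number of occupied cells, whereas a perceptron with the same first layer stores only its weights; comparing the two outputs cell by cell is one way to check a trained network against its histogram counterpart.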
Thus, the histogram-based interpretation provides statistical verification of the correctness of perceptron approximation and contributes to increased trust in classification decisions within the unary model.
Bibliography: 15 references.
Keywords:
multilayer perceptron, histogram-based regression, piecewise linear activation functions, Bayesian classifier, consistency, asymptotic equivalence, VC dimension, random hyperplanes, unary classification.
UDC: 004.8+519.6
MSC: Primary 62H30, 68T10; Secondary 62G07, 68T07
Received: 15.09.2025
DOI: 10.4213/rm10273