Seminars: V. G. Spokoiny, Estimation and Inference for smooth DNNs. Blessing of dimension

Mathematical Foundations of Artificial Intelligence
September 3, 2025 17:00, Moscow, Steklov Mathematical Institute + Kontur Talk

Estimation and Inference for smooth DNNs. Blessing of dimension

V. G. Spokoiny^abc

^a National Research University Higher School of Economics, Moscow
^b Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow
^c Weierstrass Institute for Applied Analysis and Stochastics, Berlin
https://vk.com/video-222947497_456239124

Abstract: Nonlinear regression is one of the most popular and important statistical tasks. The first methods, such as least squares estimation, go back to Gauss and Legendre. Recent models and developments in statistics and machine learning, such as deep neural networks (DNNs) and nonlinear PDEs, stimulate new research in this direction, which has to address the key challenges of modern statistical inference: huge model complexity and parameter dimension, limited sample size, and lack of convexity and identifiability, among many others. This paper offers a general approach to studying a nonlinear regression problem based on the notion of effective dimension. First, a special case of models with stochastically linear structure (SLS) is studied. The results provide finite sample expansions for the loss of the penalized maximum likelihood estimator (MLE). The leading term of such expansions, as well as the corresponding remainder, is expressed via the effective dimension and the effective sample size. The expansions can be used to derive sharp risk bounds and for statistical inference. Despite their generality, all the presented bounds are nearly sharp, and the classical asymptotic results can be obtained as simple corollaries. Although the basic SLS assumptions are not fulfilled for nonlinear smooth regression, we explain how stochastic linearity can be achieved by extending the parameter space. The general results are then specialized to nonlinear smooth regression and to DNNs with one or many hidden layers.

Language: English