
Actual Problems of Applied Mathematics
June 11, 2021, Novosibirsk


Stochastic gradient descent and data analysis

A. V. Gasnikov

Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow Region


https://www.youtube.com/watch?v=5xC4rpbNKBU&ab_channel=%D0%9C%D0%9C%D0%A4%D0%9D%D0%93%D0%A3

Abstract: In recent years, one of the most popular topics at the intersection of data analysis and optimization has been the training of deep neural networks. Mathematically, the training problem reduces to a stochastic optimization problem, which, in turn, is reduced via the Monte Carlo method to minimizing the sum of a large number of functions. It is important to note that essentially the same structure arises in almost all problems coming from data analysis: nearly all data analysis (machine learning) problems reduce to optimization problems, or more precisely to stochastic optimization problems. In mathematical statistics the underlying probability law is known (though its parameters are not), while in machine learning the probability law itself is unknown. One of the most popular ways to solve such optimization problems, and the finite-sum variants obtained from them by the Monte Carlo method, is the stochastic gradient descent method and its variations. These methods were already known in the 1950s; however, the true significance of the method has come to be appreciated only in the last twenty years, in connection with the applications noted above. This talk gives a brief overview of recent developments in this direction (adaptive step-size selection, batch size, federated learning, etc.).
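As an illustrative companion to the abstract, below is a minimal sketch in Python of the reduction and the method it describes: mini-batch stochastic gradient descent applied to a finite sum F(x) = (1/N) * sum_i f_i(x) obtained by the Monte Carlo method. The quadratic loss, the synthetic data, and all names (sgd, batch_size, step0) are assumptions made for this example, not material from the talk.

import numpy as np

# Monte Carlo reduction: minimize F(x) = (1/N) * sum_i f_i(x),
# where f_i(x) = 0.5 * (a_i^T x - b_i)^2 is an illustrative least-squares loss.
rng = np.random.default_rng(0)
N, d = 10_000, 20
A = rng.normal(size=(N, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=N)

def sgd(A, b, batch_size=32, n_iters=2_000, step0=0.5):
    """Mini-batch SGD with the decaying step size step0 / sqrt(k + 1)."""
    x = np.zeros(A.shape[1])
    for k in range(n_iters):
        idx = rng.integers(0, len(b), size=batch_size)  # sample a mini-batch
        # Unbiased stochastic estimate of the full gradient of F at x
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size
        x -= step0 / np.sqrt(k + 1) * grad
    return x

x_hat = sgd(A, b)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))

The decaying schedule step0 / sqrt(k + 1) is a classical choice; the adaptive step-size rules mentioned in the abstract (e.g., AdaGrad-style methods) instead set the step from accumulated gradient statistics, per coordinate.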

