Information Technologies and Systems 2013
Randomized strategies of a multi-armed bandit based on mirror descent method
A. V. Nazin
Institute of Control Sciences, Russian Academy of Sciences, Moscow
Abstract: We consider the multi-armed bandit problem and present an optimization approach based on mirror descent. Lower and upper bounds on the difference between the mean loss and the minimal loss are given for a broad class of problems.
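
As a rough illustration of what a randomized bandit strategy driven by mirror descent can look like, here is a minimal sketch. It is not taken from the talk. It assumes the entropic (exponentiated-gradient) form of mirror descent on the probability simplex, importance-weighted loss estimates, and losses in [0, 1], which together give an Exp3-style rule. The names `mirror_descent_bandit` and `bernoulli_loss`, the Bernoulli loss means, and the step size are all hypothetical choices made for this example.

```python
import math
import random


def bernoulli_loss(arm, rng, means=(0.8, 0.5, 0.2)):
    """Hypothetical environment: arm i yields loss 1 with probability means[i]."""
    return 1.0 if rng.random() < means[arm] else 0.0


def mirror_descent_bandit(loss_fn, n_arms, horizon, step, seed=0):
    """Entropic mirror descent over arm probabilities (illustrative sketch).

    Each round: sample an arm from the current distribution, observe only
    that arm's loss, build an importance-weighted estimate of the loss
    vector, and apply a multiplicative update followed by renormalization.
    Returns the final distribution and the mean observed loss.
    """
    rng = random.Random(seed)
    probs = [1.0 / n_arms] * n_arms  # start from the uniform distribution
    total_loss = 0.0
    for _ in range(horizon):
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(arm, rng)  # assumed to lie in [0, 1]
        total_loss += loss
        # Unbiased estimate of the loss vector: nonzero only at the pulled arm.
        estimate = loss / probs[arm]
        # Mirror step with the entropy mirror map: scale, then renormalize.
        weights = list(probs)
        weights[arm] *= math.exp(-step * estimate)
        norm = sum(weights)
        probs = [w / norm for w in weights]
    return probs, total_loss / horizon


if __name__ == "__main__":
    n_arms, horizon = 3, 5000
    # A commonly used step size for this kind of update (an assumption here).
    step = math.sqrt(2.0 * math.log(n_arms) / (horizon * n_arms))
    probs, mean_loss = mirror_descent_bandit(bernoulli_loss, n_arms, horizon, step)
    print("final distribution:", [round(p, 3) for p in probs])
    print("mean loss:", round(mean_loss, 3))
```

In this toy setting the gap between the returned mean loss and the smallest arm mean (0.2) is an empirical counterpart of the quantity the abstract bounds; the bounds themselves are not reproduced here.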