VIDEO LIBRARY

Information Technologies and Systems 2013
September 5, 2013 14:30, Svetlogorsk (Kaliningrad Region, Russia)


Randomized strategies for the multi-armed bandit based on the mirror descent method

A. V. Nazin

Institute of Control Sciences, Russian Academy of Sciences, Moscow



Abstract: We consider the multi-armed bandit problem and present an optimization approach based on mirror descent. Lower and upper bounds on the difference between the mean loss and the minimal loss are given for a broad class of problems.
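
To illustrate the kind of randomized strategy the abstract refers to, here is a minimal sketch in Python. It is not the algorithm from the talk. It assumes the entropic (negative-entropy) mirror map on the probability simplex, importance-weighted loss estimates, losses in [0, 1], and a fixed step size. The names `mirror_descent_bandit`, `bernoulli_losses`, and `step_size` are hypothetical and introduced only for this example.

```python
import math
import random


def bernoulli_losses(means):
    """Hypothetical test environment: arm i yields loss 1 with probability means[i]."""
    def loss_fn(arm, rng):
        return 1.0 if rng.random() < means[arm] else 0.0
    return loss_fn


def mirror_descent_bandit(loss_fn, n_arms, horizon, step_size, seed=0):
    """Randomized bandit strategy via entropic mirror descent (a sketch).

    Assumptions, not taken from the source: losses lie in [0, 1], the
    mirror map is negative entropy, and the unobserved losses are replaced
    by importance-weighted estimates.
    Returns the last sampling distribution and the mean observed loss.
    """
    rng = random.Random(seed)
    cum_est = [0.0] * n_arms          # dual variable: cumulative loss estimates
    probs = [1.0 / n_arms] * n_arms   # primal variable: distribution over arms
    total_loss = 0.0

    for _ in range(horizon):
        # Map the dual variable back to the simplex (Gibbs distribution).
        # Subtracting the minimum keeps the exponentials numerically stable.
        base = min(cum_est)
        weights = [math.exp(-step_size * (c - base)) for c in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]

        # Randomized choice of the arm according to the current distribution.
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(arm, rng)
        total_loss += loss

        # Only the chosen arm's loss is observed; dividing by its
        # probability makes the estimate unbiased for every arm.
        cum_est[arm] += loss / probs[arm]

    return probs, total_loss / horizon
```

For example, `mirror_descent_bandit(bernoulli_losses([0.9, 0.5, 0.1]), 3, 5000, 0.012)` should concentrate the sampling distribution on the third arm, so the mean observed loss approaches the minimal expected loss of 0.1. The gap between the two is the quantity the bounds in the abstract are about.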

