Abstract:
The problem of rational behavior in a stochastic environment, also known as the two-armed bandit problem, is considered in the robust (minimax) setting. A parallel strategy is proposed that yields control arbitrarily close to the optimal one for environments whose gains have Gaussian cumulative distribution functions with unit variance. An invariant recursive equation is obtained for computing the minimax strategy and the minimax risk, which are found as the Bayesian ones corresponding to the worst-case a priori distribution. As a result, Vogel's well-known estimates of the minimax risk can be improved. Numerical experiments show that the strategy is also efficient in environments with non-Gaussian distributions, e.g., binary ones.
Presented by a member of the Editorial Board: A. V. Nazin
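To make the problem setting concrete, the sketch below simulates a two-armed bandit with Gaussian unit-variance gains and a standard Bayesian index policy (Thompson sampling with conjugate normal posteriors). This is only an illustration of the environment described in the abstract, not the parallel minimax strategy proposed in the paper; the function names, horizon, and prior variance are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_bandit(means, horizon=1000, prior_var=1.0):
    """Two-armed bandit with Gaussian unit-variance rewards.

    Illustrative Bayesian policy (not the paper's minimax strategy):
    keep a normal posterior over each arm's mean (conjugate to the
    unit-variance Gaussian likelihood), sample from both posteriors,
    and pull the arm with the larger sampled value (Thompson sampling).
    """
    post_mean = np.zeros(2)                   # posterior means (zero-mean prior)
    post_prec = np.full(2, 1.0 / prior_var)   # posterior precisions
    total_reward = 0.0
    for _ in range(horizon):
        samples = rng.normal(post_mean, 1.0 / np.sqrt(post_prec))
        arm = int(np.argmax(samples))
        reward = rng.normal(means[arm], 1.0)  # unit-variance gain
        # Conjugate normal update with known unit observation variance.
        post_prec[arm] += 1.0
        post_mean[arm] += (reward - post_mean[arm]) / post_prec[arm]
        total_reward += reward
    return total_reward

# Regret relative to always pulling the better arm (hypothetical means).
means = np.array([0.0, 0.3])
regret = 1000 * means.max() - run_bandit(means)
print(f"regret over 1000 steps: {regret:.1f}")
```

The minimax strategy and risk discussed in the paper would instead be computed from the invariant recursive equation via the worst-case prior; the sketch above only fixes the environment model (two arms, Gaussian gains with unit variance) against which such a strategy is evaluated.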