
Avtomat. i Telemekh., 2012 Issue 4, Pages 114–130 (Mi at3793)

This article is cited in 15 papers

Robust and Adaptive Systems

Parallel design of robust control in the stochastic environment (the two-armed bandit problem)

A. V. Kolnogorov

Yaroslav-the-Wise Novgorod State University, Velikii Novgorod, Russia

Abstract: The problem of rational behavior in a stochastic environment, also known as the two-armed bandit problem, is considered in the robust (minimax) setting. A parallel strategy is proposed that yields control arbitrarily close to optimal for environments whose gains have Gaussian cumulative distribution functions with unit variance. An invariant recursive equation is obtained for computing the minimax strategy and risk, which are found as the Bayesian ones associated with the worst-case a priori distribution. As a result, the well-known Vogel estimates of the minimax risk can be improved. Numerical experiments show that the strategy is also efficient in environments with non-Gaussian distributions, e.g., binary ones.
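To make the setting concrete, the following minimal Python sketch simulates a two-armed bandit with Gaussian gains of unit variance and estimates the regret (expected gain lost relative to always using the better arm) of a simple batch rule. The rule shown here, which splits each batch equally between the arms and commits to the leader once the gap in cumulative gains exceeds a threshold, is an illustrative assumption and is not the minimax strategy derived in the paper. The names `run_batch_strategy` and `mean_regret` and the parameters `batch_size` and `threshold` are likewise hypothetical.

```python
import random


def run_batch_strategy(m1, m2, horizon, batch_size, threshold, rng):
    """Simulate one run and return its regret.

    Hypothetical rule (not the paper's strategy): explore in batches split
    equally between the arms; commit to the leading arm as soon as the
    difference of cumulative gains exceeds `threshold`.
    """
    means = (m1, m2)
    sums = [0.0, 0.0]   # cumulative observed gain per arm
    pulls = [0, 0]      # number of times each arm was used
    committed = None    # arm chosen for the rest of the horizon, if any
    t = 0
    while t < horizon:
        if committed is None:
            half = min(batch_size, horizon - t) // 2
            if half == 0:
                # Too few steps left for a balanced batch: commit now.
                committed = 0 if sums[0] >= sums[1] else 1
                continue
            for arm in (0, 1):
                for _ in range(half):
                    # Gaussian gain with mean means[arm] and unit variance.
                    sums[arm] += rng.gauss(means[arm], 1.0)
                pulls[arm] += half
            t += 2 * half
            if abs(sums[0] - sums[1]) > threshold:
                committed = 0 if sums[0] > sums[1] else 1
        else:
            # Use the committed arm for all remaining steps.
            pulls[committed] += horizon - t
            t = horizon
    best = max(means)
    return sum((best - means[a]) * pulls[a] for a in (0, 1))


def mean_regret(m1, m2, horizon=1000, batch_size=20, threshold=30.0,
                runs=200, seed=0):
    """Average the regret over `runs` independent simulated runs."""
    rng = random.Random(seed)
    total = sum(
        run_batch_strategy(m1, m2, horizon, batch_size, threshold, rng)
        for _ in range(runs)
    )
    return total / runs


if __name__ == "__main__":
    for gap in (0.1, 0.3, 1.0):
        print(gap, mean_regret(gap, 0.0))
```

In the minimax setting, the quantity of interest is the maximum of this regret over all admissible pairs of means, so a scan over several gaps as in the usage loop gives only a rough empirical picture for this particular rule.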

Presented by a member of the Editorial Board: A. V. Nazin

Received: 24.11.2010


English version:
Automation and Remote Control, 2012, 73:4, 689–701

