Abstract:
This paper formulates a modified multi-armed bandit problem in which the player may use so-called expert hints in the decision-making process. The player here is an automated system that follows a certain strategy (algorithm) to make decisions under uncertainty. The approach is developed for the case of $m$ experts. A modification of the well-known UCB1 algorithm is proposed to solve this problem. Results of a numerical experiment are presented to show the influence of expert hints on the player's payoff.
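
For orientation, the standard UCB1 index policy that the proposed algorithm builds on can be sketched as follows. This is only the textbook baseline, assuming rewards in $[0, 1]$; it does not include expert hints and is not the paper's modified algorithm. The names `ucb1`, `pull`, `n_arms`, and `horizon`, as well as the Bernoulli probabilities in the demo, are illustrative assumptions rather than the paper's notation.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run standard UCB1 (no expert hints) for `horizon` rounds.

    `pull(arm)` is a hypothetical callback returning a reward in [0, 1].
    Returns per-arm pull counts, per-arm empirical means, and total payoff.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: play every arm once.
            arm = t - 1
        else:
            # Choose the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return counts, means, total


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.8]  # hypothetical Bernoulli success probabilities
    counts, means, total = ucb1(
        lambda a: 1.0 if rng.random() < probs[a] else 0.0, len(probs), 2000
    )
    print(counts, total)
```

A hint-aware variant would change how the arm is selected at each round; how the hints of the $m$ experts enter that selection is specified in the body of the paper and is not reproduced here.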