
Teor. Veroyatnost. i Primenen., 1980 Volume 25, Issue 1, Pages 71–82 (Mi tvp986)

This article is cited in 16 papers

An $\varepsilon$-optimal control of a finite Markov chain with an average reward criterion

E. A. Feĭnberg

Moscow

Abstract: A discrete-time Markov decision chain with the average reward criterion is considered. It is proved that if the state space is finite and the action sets are measurable subsets of a Polish space, then non-randomized Markov $\varepsilon$-optimal policies exist. An example is constructed of a Markov decision chain with a countable state space and finite action sets for which no randomized Markov $\varepsilon$-optimal policies exist.

Received: 24.08.1978


English version:
Theory of Probability and its Applications, 1980, 25:1, 70–81

