Institute of Economics and Law of Academy of Sciences of GSSR, Tbilisi
Abstract:
The paper deals with a controlled Markov chain with a finite number of states $s\in S$ and a finite number of decisions $a\in A$. The optimality criterion is $\mathbf E^{\pi}\widetilde L$, where $\widetilde L$ is a functional invariant with respect to shifts of the trajectory $(s_n,a_n;\,n\ge 1)$. For small break probabilities, this criterion can be approximated by the criterion $\mathbf E^{\pi}c(s_{\tau},a_{\tau})$. The existence of an optimal stationary policy is proved, and a method for constructing it is given.
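As an illustration of the approximating criterion, suppose that $\tau$ denotes the moment at which the chain is broken, and that the break occurs at each step independently of the trajectory with a fixed probability $\beta\in(0,1)$. Both the reading of $\tau$ and the symbol $\beta$ are assumptions made here for the sketch, not taken from the abstract. Under them $\tau$ is geometric, and the criterion expands into a weighted sum of one-step expectations:

```latex
% Assumption: break with probability \beta at each step, independent of the trajectory
\mathbf P(\tau = n) = \beta(1-\beta)^{n-1}, \quad n \ge 1,
\qquad
\mathbf E^{\pi} c(s_{\tau},a_{\tau})
  = \sum_{n\ge 1} \beta(1-\beta)^{n-1}\, \mathbf E^{\pi} c(s_n,a_n).
```

In this setting the right-hand side is an Abel mean of the sequence $\mathbf E^{\pi}c(s_n,a_n)$: the weights sum to $1$, and each individual weight tends to $0$ as $\beta\to 0$. Any fixed initial segment of the trajectory therefore has vanishing influence, which is consistent with approximating a shift-invariant functional.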