Abstract:
We consider the average reward Markov decision problem with finite state and action spaces and propose an approach for determining its optimal pure and mixed stationary strategies. We show that the problem can be formulated in terms of stationary strategies, with an objective function that is quasi-monotonic (i.e., both quasi-convex and quasi-concave) on the feasible set of stationary strategies. Using this quasi-monotonic programming model with linear constraints, we justify algorithms for determining the optimal pure and mixed stationary strategies for the average reward Markov decision problem.
Keywords and phrases: Markov decision processes, average optimization criterion, stationary strategies, optimal strategies.