Informatika i Ee Primeneniya [Informatics and its Applications]

Inform. Primen., 2022 Volume 16, Issue 2, Pages 109–117 (Mi ia793)

Controlling a bounded two-dimensional Markov chain with a given invariant measure

M. G. Konovalov, R. V. Razumchik

Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: A two-dimensional discrete-time Markov chain (random walk) with a bounded continuous state space (a rectangle) is considered. At each transition, provided that the chain is not on the boundary, it moves in one of four possible directions (north, south, east, or west), chosen depending on its current position. Once a direction has been selected, the length of the jump within the admissible interval is determined by a random variable. Given some (reference) distribution on the state space, one seeks to solve the inverse control problem, i.e., to find a control strategy (probabilities of choosing each direction) that brings the stationary distribution of the chain close (in a certain sense) to the reference distribution. A solution based on the policy gradient method is proposed. Illustrative examples are provided.
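
The chain described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' implementation, and several details are assumptions made here for illustration only: the jump length is drawn uniformly on the interval between the current point and the wall in the chosen direction, directions with no room left are excluded and the remaining probabilities renormalized, occupancy is estimated on a coarse grid, and closeness to the reference distribution is measured by an L1 distance between grid frequencies. The names `policy`, `occupancy`, and `l1_distance` are hypothetical.

```python
import random

# The four possible directions as unit steps: north, south, east, west.
DIRECTIONS = {"N": (0.0, 1.0), "S": (0.0, -1.0), "E": (1.0, 0.0), "W": (-1.0, 0.0)}


def room(x, y, d, width, height):
    """Distance from (x, y) to the wall of the rectangle in direction d."""
    return {"N": height - y, "S": y, "E": width - x, "W": x}[d]


def step(x, y, policy, width, height, rng):
    """One transition: choose a direction from policy(x, y), then a jump length."""
    probs = policy(x, y)  # hypothetical interface: dict direction -> probability
    # Assumption: directions with no room are excluded, the rest renormalized.
    dirs = [d for d in DIRECTIONS if room(x, y, d, width, height) > 0.0]
    d = rng.choices(dirs, weights=[probs[k] for k in dirs])[0]
    # Assumption: the jump length is uniform on the admissible interval.
    length = rng.uniform(0.0, room(x, y, d, width, height))
    dx, dy = DIRECTIONS[d]
    return x + dx * length, y + dy * length


def occupancy(policy, width=1.0, height=1.0, bins=4, steps=20000, seed=0):
    """Empirical occupancy frequencies of the chain on a bins x bins grid."""
    rng = random.Random(seed)
    x, y = width / 2, height / 2
    counts = [[0] * bins for _ in range(bins)]
    for _ in range(steps):
        x, y = step(x, y, policy, width, height, rng)
        i = min(int(x / width * bins), bins - 1)
        j = min(int(y / height * bins), bins - 1)
        counts[i][j] += 1
    return [[c / steps for c in row] for row in counts]


def l1_distance(freq, reference):
    """Assumed closeness measure: L1 distance between two grid distributions."""
    return sum(abs(f - r) for fr, rr in zip(freq, reference) for f, r in zip(fr, rr))


def uniform_policy(x, y):
    """Baseline control strategy: all four directions equally likely everywhere."""
    return {"N": 0.25, "S": 0.25, "E": 0.25, "W": 0.25}
```

For example, `l1_distance(occupancy(uniform_policy), reference)` with `reference` a 4 x 4 grid of target probabilities gives one number that a policy gradient procedure could try to decrease by adjusting the direction probabilities.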

Keywords: Markov chain control, continuous state space, policy gradient, unmanned air vehicles.

Received: 17.04.2022

DOI: 10.14357/19922264220214




