Abstract:
This paper reviews state-of-the-art reinforcement learning methods, with a focus on their
application in dynamic and complex environments. The study begins by analysing the main approaches to
reinforcement learning, such as dynamic programming, Monte Carlo methods, temporal-difference methods
and policy gradients. Special attention is given to the Generative Adversarial Imitation Learning (GAIL)
methodology and its impact on the optimisation of agents' strategies. A study of model-free learning is
presented and criteria for selecting agents capable of operating in continuous action and state spaces are
highlighted. The experimental part is devoted to analysing the learning of agents using different types of
sensors, including visual sensors, and demonstrates their ability to adapt to the environment despite
resolution constraints. A comparison of results based on cumulative reward and episode length is
presented, revealing improved agent performance in the later stages of training. The study confirms that
the use of imitation learning significantly improves agent performance, reducing training time and
yielding better decision-making strategies. This work motivates further research into improving sensor
resolution and fine-tuning hyperparameters.