Abstract:
In recent years, the branch of Bellman optimality theory known as reinforcement learning has been enriched with efficient algorithms that have found wide application in many fields, including spaceflight mechanics. These methods build on approximate dynamic programming, optimization of functions of many variables, and the theory of partially observable Markov decision processes. Their advantage over many other control methods is that they require far fewer mathematical assumptions and apply to a wider range of problems. Numerous examples demonstrate that control strategies obtained with these methods can adapt to unknown or changing parameters of the spacecraft and its environment. The author's review of applications of these methods to spacecraft control problems has revealed a common methodology for constructing such strategies.
The report presents a general methodology for converting optimal control problems for mechanical systems into reinforcement learning problems, together with a software architecture for solving such problems numerically. As an application, the problem of maintaining motion near unstable halo orbits in the vicinity of lunar libration points is considered. We study both purely neural-network control models and hybrid models, in which the control is computed by the Floquet mode method or by the Cauchy-Green method developed by the author, and a neural network supplies an additive component that optimizes the control.
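
As a rough illustration of the conversion described above, the following minimal sketch wraps a controlled dynamical system as a reinforcement learning environment whose reward is the negated running cost. Everything in it is an assumption made for illustration and is not taken from the report: the toy one-dimensional unstable system x'' = k*x + u stands in for motion near an unstable orbit, the class and function names are hypothetical, the simplified reset/step interface only loosely follows common RL libraries, and a hand-tuned linear feedback takes the place of the Floquet mode and Cauchy-Green controllers. The learned additive component of a hybrid model is represented by a stub.

```python
from dataclasses import dataclass


@dataclass
class ToyStationKeepingEnv:
    """Toy unstable system x'' = k*x + u (hypothetical stand-in, not a halo orbit model)."""
    k: float = 1.0       # instability rate (illustrative value)
    dt: float = 0.01     # integration step
    q: float = 1.0       # weight on position deviation in the cost
    r: float = 0.1       # weight on control effort in the cost
    horizon: int = 500   # episode length in steps

    def reset(self, x0=0.1, v0=0.0):
        self.x, self.v, self.t = x0, v0, 0
        return (self.x, self.v)

    def step(self, u):
        # Semi-implicit Euler step of the controlled dynamics.
        self.v += (self.k * self.x + u) * self.dt
        self.x += self.v * self.dt
        self.t += 1
        # Reward is the negated running cost of the optimal control problem.
        reward = -(self.q * self.x ** 2 + self.r * u ** 2) * self.dt
        done = self.t >= self.horizon or abs(self.x) > 10.0
        return (self.x, self.v), reward, done


def baseline_control(obs, kp=3.0, kd=3.0):
    """Hand-tuned stabilizing feedback (placeholder for a classical control law)."""
    x, v = obs
    return -kp * x - kd * v


def hybrid_policy(obs, correction=lambda obs: 0.0):
    """Baseline control plus an additive correction; the stub returns zero
    where a trained neural network would be evaluated."""
    return baseline_control(obs) + correction(obs)


def rollout(env, policy):
    """Run one episode and return the accumulated reward (negated total cost)."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

In this sketch, maximizing the return of `rollout` is equivalent to minimizing the discretized quadratic cost, which is the sense in which the control problem becomes a learning problem. A purely neural-network model would replace `hybrid_policy` entirely, while a hybrid model would train only the `correction` term.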
The work was carried out with the financial support of the Russian Science Foundation (project No. 24-71-00032).