Abstract:
The collective behavior of automata is one direction in the development of machine learning
methods. Such automata, referred to here as machines, exhibit goal-oriented behavior. The machine performs an action,
in response to which the environment sends its output signal to the input of the machine. The machine, in
accordance with its design, responds to this input signal with the next action. Thus, a closed loop of
interaction is formed between the environment and the machine operating in it. In many cases the
environment can itself be implemented as a machine. Evaluating the effectiveness of the machine is
posed as an optimization problem: maximizing the sum of positive signals (rewards), or minimizing
the sum of negative signals (penalties), received from the environment over the period considered.
The properties of the environment and the actions of the machines are formalized, and the
results obtained are processed, using the apparatus of game theory. In this setting, signals
from the environment are conveniently represented as the total winnings and losses of the
player machines. This paper compares machines of different designs, since the efficiency
of a machine's responses is determined not only by the properties of the environment but also by
parameters such as the type and depth of its memory.
Keywords: automaton, expedient behavior, optimal strategy, memory depth, game theory,
formalization of the environment, dynamic environment
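
The closed loop of interaction described in the abstract can be illustrated with a minimal sketch. The abstract does not name specific machine designs, so the example assumes a classical automaton with linear tactics (in the style of Tsetlin) and a stationary random environment that returns a penalty with a fixed probability per action. All names here (LinearTacticAutomaton, StationaryEnvironment, run, penalty_probs) are hypothetical and chosen only for illustration.

```python
import random


class LinearTacticAutomaton:
    """Illustrative automaton with linear tactics (an assumption, not the paper's design).

    Each action has `depth` memory levels. A reward moves the machine deeper
    into the current action; a penalty moves it toward the boundary, and a
    penalty at the boundary level switches to the next action (cyclically).
    """

    def __init__(self, n_actions, depth):
        self.n_actions = n_actions
        self.depth = depth
        self.action = 0
        self.level = 1  # 1 is the boundary level, `depth` is the deepest

    def act(self):
        return self.action

    def update(self, penalty):
        if not penalty:
            self.level = min(self.level + 1, self.depth)
        elif self.level > 1:
            self.level -= 1
        else:
            self.action = (self.action + 1) % self.n_actions


class StationaryEnvironment:
    """Returns a penalty for action i with fixed probability penalty_probs[i]."""

    def __init__(self, penalty_probs, rng):
        self.penalty_probs = penalty_probs
        self.rng = rng

    def respond(self, action):
        return self.rng.random() < self.penalty_probs[action]


def run(machine, env, steps):
    """Closed loop: action -> environment signal -> state update.

    Returns the average penalty per step over the period considered.
    """
    penalties = 0
    for _ in range(steps):
        action = machine.act()
        penalty = env.respond(action)
        machine.update(penalty)
        penalties += int(penalty)
    return penalties / steps


if __name__ == "__main__":
    probs = [0.9, 0.1]  # assumed penalty probabilities for two actions
    for depth in (1, 5):
        env = StationaryEnvironment(probs, random.Random(0))
        rate = run(LinearTacticAutomaton(len(probs), depth), env, 20000)
        print(f"memory depth {depth}: average penalty {rate:.3f}")
```

Under these assumptions, increasing the memory depth makes the machine hold the less-penalized action longer, which is one way the dependence of efficiency on memory depth can be observed in simulation.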