Z. A. Volovikova, M. A. Kuznetsova, A. A. Skrynnik, A. I. Panov, “Review of multimodal environments for reinforcement learning”, Dokl. RAN. Math. Inf. Proc. Upr., 2024, Volume 520, Number 2, Pages 124

SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES

Review of multimodal environments for reinforcement learning

Z. A. Volovikova^ab, M. A. Kuznetsova^a, A. A. Skrynnik^bc, A. I. Panov^abc

^a Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow Region
^b Artificial Intelligence Research Institute, Moscow, Russia
^c Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow, Russia

Abstract: This article presents a review and comparative analysis of multimodal virtual environments for reinforcement learning. Seven environments are considered: HomeGrid, BabyAI, RTFM, Messenger, Touchdown, Alfred, and IGLU. The review focuses on their distinctive features and the requirements they place on agents, with particular attention to the complexity of the text instructions and the dynamic properties of the environment. The analysis identifies the strengths and weaknesses of each environment, which helps determine the optimal conditions for effective agent training. It also highlights the need for more balanced environments that combine demanding language understanding with demanding interaction with the surroundings.

Keywords: multimodal learning, language grounding, reinforcement learning.

UDC: 004.5

Received: 01.10.2024
Accepted: 07.10.2024

DOI: 10.31857/S2686954324700449