Abstract:
This article presents a methodology for hierarchical multi-task learning of neural networks, inspired by the ERNIE 3.0 architecture, and its experimental validation on the FRED-T5 model for Russian-language text analysis and generation tasks. Hierarchical multi-task learning is a promising approach to building universal language models capable of efficiently solving a wide range of natural language processing (NLP) tasks. The proposed methodology combines task-specific encoder blocks for natural language understanding (NLU) with a shared decoder for natural language generation (NLG), improving model quality while reducing computational costs. The paper provides a comparative evaluation of the developed methodology on the open Russian SuperGLUE benchmark using the pre-trained Russian-language model FRED-T5-1.7B. Experimental results show a significant improvement in model quality in both zero-shot and few-shot settings compared to the baseline configuration. The paper also discusses practical applications of the approach to real-world NLP tasks and offers recommendations for further development of the methodology and its integration into applied systems for processing Russian-language text.
Keywords: hierarchical multi-task learning, FRED-T5, natural language processing, neural networks, text generation, text analysis, zero-shot learning, few-shot learning, seq2seq models.