Abstract:
Multilingual Large Language Models (LLMs) often exhibit degraded performance in languages other than English due to the imbalance in their training data. Directly adapting these models to a new language, such as Russian, risks catastrophic forgetting of their original capabilities and demands significant computational resources. In this paper, we introduce Ruadapt, a comprehensive and computationally efficient methodology for the language adaptation of LLMs that features tokenizer replacement. A full adaptation of a single Qwen3-8B model variant with our methodology requires less than 2,000 GPU-hours, while subsequent adaptations of other variants are up to 10 times less resource-intensive thanks to the modular nature of the procedure's steps. The optimal configuration achieves up to an 80% speed-up in generation, with full preservation of long-context capabilities and only minor degradation in instruction-following performance. We conduct a detailed empirical study of each adaptation step to identify optimal hyperparameters and to assess the impact of each key stage on final quality. The resulting guidelines are implemented in the current generation of Ruadapt models, such as RuadaptQwen3-32B-Hybrid. We are open-sourcing our models, code, and datasets to provide the research community with a validated and cost-effective strategy for developing high-quality, language-specific models.
Keywords: large language models, language adaptation, Russian language.