Abstract:
In this work, we focus on a common NLP model design: fine-tuning a multilingual language model with data for the target task in one language in order to solve the same task in a different target language. We aim to determine how popular speedup techniques affect the multilingual capabilities of a Transformer-based model and additionally investigate the use of these techniques in combination. As a result, we obtain a NERC model that can be efficiently run on a CPU and retains its multilingual properties across several test languages after being fine-tuned and accelerated with only English data available.
Key words and phrases: BERT, pruning, quantization, NERC.