
Zap. Nauchn. Sem. POMI, 2024, Volume 540, Pages 214–232 (Mi znsl7552)

MMA: a fight for multilingual models acceleration

N. Sukhanovskii, M. Ryndin

Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow, Russia

Abstract: In this work we focus on a common NLP model design: fine-tuning a multilingual language model on data for the target task in one language in order to solve the same task in a different target language. We aim to determine how popular speedup techniques affect the multilingual capabilities of a Transformer-based model, and we additionally investigate the use of these techniques in combination. As a result, we obtain a NERC model that can be inferred efficiently on CPU and retains its multilingual properties across several test languages after being tuned and accelerated with only English data available.
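As a minimal sketch of the kind of acceleration the abstract describes (not the authors' exact pipeline), the following applies post-training dynamic quantization to a multilingual BERT fine-tuned for token classification, then runs CPU inference on a non-English input to probe cross-lingual transfer. The checkpoint name and label count are illustrative assumptions.

import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Hypothetical checkpoint: a multilingual BERT assumed to be
# fine-tuned for NERC on English data only.
ckpt = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForTokenClassification.from_pretrained(ckpt, num_labels=9)
model.eval()

# Post-training dynamic quantization: Linear-layer weights are stored
# in int8, activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# CPU inference on a Russian sentence to check that multilingual
# behavior survives acceleration.
inputs = tokenizer("Москва находится в России.", return_tensors="pt")
with torch.no_grad():
    logits = quantized(**inputs).logits
print(logits.argmax(dim=-1))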

Key words and phrases: BERT, pruning, quantization, NERC.

Received: 15.11.2024

Language: English



© Steklov Math. Inst. of RAS, 2025