
Proc. Mach. Learn. Res. (PMLR), 2024, Volume 244, Pages 2527–2536 (Mi pmlr2)

Quantization of large language models with an overdetermined basis

Daniil Merkulov (a, b), Daria Cherniuk (a), Alexander Rudikov (a, c), Ivan Oseledets (a, c, d), Ekaterina Muravleva (a, e), Aleksandr Mikhalev (a), Boris Kashin (c, f)

a Skolkovo Institute of Science and Technology, Moscow, Russia
b Moscow Institute of Physics and Technology, Moscow, Russia
c Steklov Mathematical Institute of Russian Academy of Sciences, Moscow, Russia
d Artificial Intelligence Research Institute, Moscow, Russia
e Sberbank PJSC, Vavilova st., 19, 117312, Moscow, Russia
f M. V. Lomonosov Moscow State University, Moscow, Russia

Language: English


ArXiv: 2404.09737

