Proceedings of Machine Learning Research (PMLR), 2024, Volume 244, Pages 2527–2536 (Mi pmlr2)
Quantization of large language models with an overdetermined basis
Daniil Merkulov (a, b), Daria Cherniuk (a), Alexander Rudikov (a, c), Ivan Oseledets (a, c, d), Ekaterina Muravleva (a, e), Aleksandr Mikhalev (a), Boris Kashin (c, f)
(a) Skolkovo Institute of Science and Technology, Moscow, Russia
(b) Moscow Institute of Physics and Technology, Moscow, Russia
(c) Steklov Mathematical Institute of Russian Academy of Sciences, Moscow, Russia
(d) Artificial Intelligence Research Institute, Moscow, Russia
(e) Sberbank PJSC, Vavilova st., 19, 117312, Moscow, Russia
(f) M. V. Lomonosov Moscow State University, Moscow, Russia
Language: English
ArXiv: 2404.09737