
Proceedings of ISP RAS, 2024 Volume 36, Issue 1, Pages 7–22 (Mi tisp852)

Neural vector compression in approximate nearest neighbor search on large datasets

I. O. Buyanov (a), V. V. Yadrintsev (a), I. V. Sochenkov (a, b, c, d)

a Federal Research Center "Computer Science and Control" of Russian Academy of Sciences
b Ivannikov Institute for System Programming of the RAS
c Innopolis University
d Institute for Regenerative Medicine, I. M. Sechenov First Moscow State Medical University

Abstract: The paper tests the hypothesis that neural autoencoders can serve as a vector compression method in an approximate nearest neighbor search pipeline. The evaluation was conducted on several large datasets using various autoencoder architectures and indexes. Although no combination of autoencoder and index fully outperforms the pure solutions, such combinations can be useful in some cases. We also identify empirical relationships between the optimal dimensionality of the hidden layer and the intrinsic dimensionality of the datasets, and show that the loss function is a determining factor for compression quality.
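
To make the pipeline in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' method: a linear projection (PCA via SVD) stands in for a neural autoencoder, exact brute-force search stands in for an approximate index, and the data are synthetic with a low-dimensional structure. All function names and parameters are hypothetical. It compresses the vectors, searches in the compressed space, and measures recall against neighbors found in the original space.

```python
import numpy as np

def fit_linear_encoder(data, code_dim):
    """Fit a linear stand-in for an autoencoder (PCA); assumption, not the paper's model."""
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:code_dim].T

def encode(vectors, mean, projection):
    """Project vectors into the compressed (code) space."""
    return (vectors - mean) @ projection

def knn(base, queries, k):
    """Exact k nearest neighbors by squared Euclidean distance (stand-in for an index)."""
    d2 = ((queries[:, None, :] - base[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]

def recall_at_k(found, truth):
    """Fraction of true neighbors that were also found in the compressed space."""
    hits = sum(len(set(f) & set(t)) for f, t in zip(found, truth))
    return hits / truth.size

rng = np.random.default_rng(0)
# Synthetic data: 8 latent dimensions embedded in 64, plus small noise.
latent = rng.normal(size=(2000, 8))
mixing = rng.normal(size=(8, 64))
base = latent @ mixing + 0.01 * rng.normal(size=(2000, 64))
queries = base[:50] + 0.01 * rng.normal(size=(50, 64))

truth = knn(base, queries, k=10)                      # neighbors in the original space
mean, proj = fit_linear_encoder(base, code_dim=8)     # hypothetical code size
codes = encode(base, mean, proj)
found = knn(codes, encode(queries, mean, proj), k=10) # neighbors in the compressed space
recall = recall_at_k(found, truth)
print(f"recall@10 after 64 -> 8 compression: {recall:.3f}")
```

Varying `code_dim` in this sketch gives a rough feel for how the size of the compressed representation interacts with the dimensionality of the data's underlying structure, which is the kind of relationship the abstract reports for real datasets.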

Keywords: approximate nearest neighbor search, autoencoders, large datasets

DOI: 10.15514/ISPRAS-2024-36(1)-1


