Artificial Intelligence and Decision Making, 2022, Issue 2, Pages 27–35 (Mi iipr62)

Analysis of textual and graphical information

Methods for cross-lingual retrieval of similar documents in legal domain based on machine learning

V. V. Zhebel (a), D. A. Devyatkin (b), D. V. Zubarev (b), I. V. Sochenkov (b, c, d)

a Limited liability company "Technologies for Systems Analysis", Moscow, Russia
b Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow, Russia
c Innopolis University, Kazan, Russia
d Ivannikov Institute for System Programming of the RAS, Moscow, Russia

Abstract: The need to study international experience in order to improve legislation requires information retrieval systems that perform well in the multilingual legal domain. One possible solution is the retrieval of thematically similar documents. An important part of this task is transfer between languages, so that the user can submit a document in one language and obtain search results in another. The paper describes different approaches to this problem, from classical mediator-based methods to modern methods of distributional semantics. The UN Digital Library is used as the test collection. The combination of the extended translation model and the BM25 ranking function demonstrates the best results.

Keywords: cross-lingual document retrieval, distributional semantics, information retrieval in the legal domain.

DOI: 10.14357/20718594220203


English version: 2023, 50:5, 494–499
