Sistemy i Sredstva Informatiki [Systems and Means of Informatics]

Sistemy i Sredstva Inform., 2023, Volume 33, Issue 1, Pages 24–34 (Mi ssi867)

Integration capacities of supracorpora databases

A. A. Durnovo (a), O. Yu. Inkova (a, b), V. A. Nuriev (a)

(a) Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
(b) University of Geneva, 22 Bd des Philosophes, CH-1205 Geneva 4, Switzerland

Abstract: The paper examines the integration capacities of the supracorpora databases developed at the Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences. It shows how three databases are integrated with one another: the Supracorpora database of hierarchal logical-semantic relations (SDBH LSR), the Database of parallel texts (DBT), and the Supracorpora database of connectives (SDBC). The information system of hierarchal logical-semantic relations (ISHLSR) uses a specially designed database, the SDBH LSR, in which annotations of logical-semantic relations are represented as trees, i.e., directed connected acyclic graphs whose nodes contain data and whose edges express subordination between nodes. Along with the SDBH LSR, the ISHLSR draws on data from the DBT and the SDBC. This integration makes it possible to combine the methodological strengths of informatics, contrastive and corpus linguistics, and the theory and practice of translation without losing sight of factors that may adversely affect the validity and reliability of the final data.
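
The tree representation described in the abstract can be illustrated with a minimal Python sketch. It is not the schema of the SDBH LSR: the class name `RelationNode`, the fields `text_id` and `connective_id` (stand-ins for references into the DBT and the SDBC), and the relation labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class RelationNode:
    """One node of a hypothetical annotation tree (not the actual SDBH LSR schema)."""
    label: str                            # illustrative relation label
    text_id: Optional[int] = None         # assumed key into a parallel-text store
    connective_id: Optional[int] = None   # assumed key into a connectives store
    children: List["RelationNode"] = field(default_factory=list)

    def add(self, child: "RelationNode") -> "RelationNode":
        """Attach a subordinate node and return it."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "RelationNode"]]:
        """Yield (depth, node) pairs in depth-first, parent-before-child order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def is_tree(root: RelationNode) -> bool:
    """Return True if no node is reachable from the root by more than one path."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:   # reached twice: a cycle or a shared child
            return False
        seen.add(id(node))
        stack.extend(node.children)
    return True


# Illustrative annotation: labels and ids are invented for the example.
root = RelationNode("cause", text_id=1, connective_id=10)
condition = root.add(RelationNode("condition", text_id=1, connective_id=11))
condition.add(RelationNode("specification", text_id=1))
root.add(RelationNode("concession", text_id=1, connective_id=12))

outline = [(depth, node.label) for depth, node in root.walk()]
```

Here each edge runs from a parent to a subordinate node, and `is_tree` rejects any structure in which a node can be reached twice, which covers both cycles and shared children.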

Keywords: supracorpora database, integration of databases, multilingual corpus, parallel corpus, corpus-based information resources, translation studies, contrastive linguistics, machine translation.

Received: 11.10.2022

DOI: 10.14357/08696527230103


