Informatika i Ee Primeneniya [Informatics and its Applications]

Inform. Primen., 2016, Volume 10, Issue 1, Pages 106–118 (Mi ia408)

This article is cited in 24 papers

Representation of cross-lingual knowledge about connectors in supracorpora databases

I. M. Zatsman (a), O. Yu. Inkova (b, a), M. G. Kruzhkov (a), N. A. Popkova (a)

(a) Institute of Informatics Problems, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
(b) University of Geneva, 22 Bd des Philosophes, CH-1205 Geneva 4, Switzerland

Abstract: The article considers “supracorpora databases”, which are used in contrastive linguistic studies. Such databases are built by processing parallel texts from the bilingual parallel subcorpora of the Russian National Corpus. Each parallel text contains either one original Russian text with one or more translations into a foreign language, or one original foreign-language text with one translation into Russian. Every source text is aligned with its translation(s) at the sentence level. Supracorpora databases are a new type of linguistic resource designed for goal-oriented discovery of new knowledge about various linguistic units. This knowledge is needed to improve the quality of machine translation, to update monolingual and bilingual grammars, and to modernize a wide range of academic courses in fields such as linguistics and translation studies. The article describes the conceptual foundations of the database and gives an example of how it can be implemented to represent knowledge about Russian connectors and their French translation correspondences.

Keywords: cross-lingual studies; Russian connectors; representation of knowledge about connectors; supracorpora databases.

Received: 28.01.2016

DOI: 10.14357/19922264160110


