Informatika i Ee Primeneniya [Informatics and its Applications]

Inform. Primen., 2007 Volume 1, Issue 1, Pages 54–65 (Mi ia120)

This article is cited in 3 papers

Linguistic simulation for machine translation and knowledge management systems

E. B. Kozerenko

Institute for Problems of Informatics RAS

Abstract: This paper addresses key problems in constructing semantic-syntactic representations for machine translation systems and for systems that extract knowledge from natural language texts. The aim of our research is to build an integrated linguistic model based on a synergetic approach that combines linguistic knowledge, statistical methods, and machine learning mechanisms in order to extract new grammar rules from text corpora and to disambiguate language structures. To formalize linguistic knowledge, we have developed a new Cognitive Transfer Grammar, a semantically motivated version of a generative unification grammar. To prepare the training components of the system and to obtain statistical data about language structures, a multilingual resource is being created, comprising a Treebank and a corpus of semantically aligned parallel texts in Russian, English, and a number of other European languages.

Keywords: machine translation; grammar formalisms; linguistic model; parallel text alignment; semantics; syntax; phrase structure.


