Informatika i Ee Primeneniya [Informatics and its Applications]

Inform. Primen., 2015 Volume 9, Issue 2, Pages 92–110 (Mi ia373)

This article is cited in 4 papers

Associative portraits of subject areas as a tool for automated construction of big data systems for knowledge extraction: theory, methods, visualization, and application

I. V. Galina, E. B. Kozerenko, Yu. I. Morozova, N. V. Somin, M. M. Charnine

Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow, 119333, Russian Federation

Abstract: The paper presents a technique for developing knowledge extraction systems based on the automated formation of an associative portrait of a subject area (APSA) and the construction of a semantic context space (SCS). The APSA concept rests on the distributional hypothesis, which states that semantically equivalent (or related) lexemes occur in similar contexts and, conversely, that lexemes occurring in similar contexts are semantically close. The model uses an extended form of this hypothesis: it examines similarities and differences in the contexts not only of individual words but also of arbitrary multilexeme fragments of meaningful word combinations. Examples of implemented projects for different subject domains are given.
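
The distributional hypothesis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' APSA or SCS method, which the abstract does not specify. It is a toy example that assumes whitespace tokenization, a symmetric context window of two tokens, raw co-occurrence counts, and cosine similarity. The corpus and the names `context_vectors` and `cosine` are hypothetical.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each token to a Counter of the tokens seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            start = max(0, i - window)
            # Neighbours to the left and right, excluding the token itself.
            context = tokens[start:i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(token, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "stocks fell on monday",
]

vecs = context_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])        # words sharing many contexts
sim_cat_stocks = cosine(vecs["cat"], vecs["stocks"])  # words sharing no contexts
print(sim_cat_dog > sim_cat_stocks)
```

In this sketch "cat" and "dog" share most of their neighbours and so score higher than "cat" and "stocks", which share none. Extending the idea to multilexeme fragments, as the abstract describes, would require treating word combinations as single units before counting contexts. How the paper does this is not stated here.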

Keywords: semantic modeling; associations; mathematical statistics; distributive semantics; big data; automated extraction of knowledge; digital natural language text corpora; semantic search; intelligent Internet technology.

Received: 21.04.2015

DOI: 10.14357/19922264150211


