V. I. Budzko, V. V. Yadrintsev, I. V. Sochenkov, V. I. Korolev, V. G. Belenkov, “Extraction of confidentiality markers from texts under conditions of high uncertainty in systems with data intensive usage”, Inform. Primen., 2020, Volume 14, Issue 4, Pages 69

This article is cited in 1 paper

Extraction of confidentiality markers from texts under conditions of high uncertainty in systems with data intensive usage

V. I. Budzko^a, V. V. Yadrintsev^ab, I. V. Sochenkov^a, V. I. Korolev^a, V. G. Belenkov^a

^a Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
^b Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation

Abstract: The article addresses the formation of confidentiality markers for use in data-intensive systems when the composition and structure of the protected information cannot be determined in advance, either because data are lacking or change rapidly, or when defining them is impractical because of the large number of entities whose information is subject to protection. An approach is proposed for forming confidentiality markers for text materials under these conditions. The article presents a method of semantic text analysis that forms confidentiality markers for ensuring information security in data-intensive systems under high uncertainty in the composition and structure of the protected information. The experimental results obtained indicate that practical implementation of the approach in data-intensive systems is promising.

Keywords: confidentiality marker, information security, data-intensive domains, topical cluster, semantics, data leak prevention, intelligent security tasks, text classification, detection of text reuse.

Received: 23.06.2020

DOI: 10.14357/19922264200410