Sistemy i Sredstva Informatiki [Systems and Means of Informatics]

Sistemy i Sredstva Inform., 2019 Volume 29, Issue 3, Pages 52–65 (Mi ssi654)

This article is cited in 1 paper

Clustering method of news media reports based on conceptual analysis

V. N. Zakharov (a), R. R. Musabaev (b), A. M. Krasovitskiy (b), Ya. D. Kozlovskaya (c), Al-dr A. Khoroshilov (d), Al-ey A. Khoroshilov (e)

(a) Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119133, Russian Federation
(b) Institute of Information and Computational Technologies, 125 Pushkin Str., Almaty 050010, Kazakhstan
(c) Moscow Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125993, Russian Federation
(d) Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119133, Russian Federation
(e) The 27th Central Research Institute of the Ministry of Defence of the Russian Federation, 5, 1st Khoroshevsky Passage, Moscow 123007, Russian Federation

Abstract: The article describes a solution to the problem of clustering news media reports. The solution rests on two components: a technique, developed by the authors, for automatically calculating a measure of the semantic meaningfulness of the names of concepts in documents from their statistical, syntactic, and semantic features; and technologies for automatically generating declarative means for clustering documents based on methods of semantic-syntactic and conceptual analysis. Using the suggested technique and the software and declarative means created in the course of the study, an experiment was conducted on a representative array of news media reports. Analysis of the results showed that using semantic correlating coefficients of concepts improves the accuracy of establishing semantic similarity between documents when the semantic meaningfulness of the textual names of concepts is established automatically.

Keywords: text clustering, semantic-syntactic analysis, conceptual analysis, declarative means, statistical measure of meaningfulness of textual names of documents, semantic correlating coefficient, semantic similarity between documents.
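
The general idea of weighting concepts when comparing documents can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the formula for the meaningfulness measure or for the semantic correlating coefficients, so the sketch assumes a plain weighted cosine similarity over concept-name counts and a simple greedy threshold clustering. The names `weighted_cosine`, `greedy_cluster`, the weight table, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def weighted_cosine(doc_a, doc_b, weight):
    """Cosine similarity of two concept-count vectors.

    Each concept count is scaled by a per-concept weight (assumed here to
    stand in for a "meaningfulness" score); unknown concepts get weight 1.0.
    """
    def w(concept):
        return weight.get(concept, 1.0)

    shared = doc_a.keys() & doc_b.keys()
    dot = sum(doc_a[c] * doc_b[c] * w(c) ** 2 for c in shared)
    norm_a = math.sqrt(sum((n * w(c)) ** 2 for c, n in doc_a.items()))
    norm_b = math.sqrt(sum((n * w(c)) ** 2 for c, n in doc_b.items()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(docs, weight, threshold=0.5):
    """Assign each document to the first cluster whose seed document is
    similar enough; otherwise start a new cluster (illustrative only)."""
    clusters = []  # each cluster is a list of document indices
    for i, doc in enumerate(docs):
        for cluster in clusters:
            seed = docs[cluster[0]]
            if weighted_cosine(doc, seed, weight) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy documents represented as counts of concept names (invented data).
docs = [
    Counter({"central bank": 2, "interest rate": 1, "report": 1}),
    Counter({"interest rate": 2, "central bank": 1, "report": 1}),
    Counter({"football match": 2, "report": 1}),
]
# Hypothetical weights: the generic concept "report" is down-weighted.
weight = {"report": 0.1}

clusters = greedy_cluster(docs, weight, threshold=0.5)
print(clusters)  # [[0, 1], [2]]
```

In this toy example, down-weighting the generic concept "report" keeps the two finance reports together and leaves the sports report in its own cluster, which is the kind of effect the abstract attributes to concept weighting.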

Received: 23.07.2019

DOI: 10.14357/08696527190305


