Abstract:
The article describes a solution to the problem of clustering news media reports.
The solution is based on a technique, developed by the authors, for automatically
calculating a measure of the semantic meaningfulness of the names of concepts in
documents from their statistical, syntactic, and semantic features, and on
technologies for automatically generating declarative means for document clustering
based on methods of semantic-syntactic and conceptual analysis of the documents.
Using the proposed technique and the software and declarative means created in the
course of the study, an experiment was conducted on a representative array of news
media reports. Analysis of the results showed that using semantic correlating
coefficients of concepts improves the accuracy of establishing semantic similarity
between documents when the semantic meaningfulness of the textual names of concepts
is established automatically.
Keywords: text clustering, semantic-syntactic analysis, conceptual analysis, declarative means, statistical measure of meaningfulness of textual names of documents, semantic correlating coefficient, semantic similarity between documents.