Abstract:
The paper proposes a modified algorithm for extracting key terms from text, a semantic model of a document corpus that represents the corpus as a graph for subsequent analysis, and an algorithm for synthesizing a document corpus with specified features from the results of information retrieval on the global network. An approach to processing abstracts of master's and doctoral theses is considered. An experiment on identifying semantically similar groups of documents in a corpus is described.
Keywords: intelligent analysis of text data, semantic model, ontology, latent semantic analysis, cluster analysis, information retrieval system.