Abstract:
This work analyzes approaches and algorithms for classifying text documents and considers one approach to thematic classification. The approach uses a specially constructed measure of document proximity that takes into account the specifics of the subject area. The weight coefficients in the formula for the proximity measure are set according to the assumed a priori reliability of the data on the corresponding scale.
Keywords: document, coordinate indexing, measure of proximity, nominal scale.