Abstract:
The paper studies the publicistic style of F. M. Dostoevsky on the basis of publications in the journals “Time” and “Epoch” (1861–1865). To this end, text fragments of 500, 700 and 1000 words were selected (including texts by other authors: M. M. Dostoevsky, N. N. Strakhov, A. A. Golovachev, etc.), and the occurrences of bigrams and trigrams (encoded sequences of parts of speech) were counted in them. Decision trees were built from these counts, and the accuracy of text recognition was analyzed. For classification at the first level of the tree (fragment size 1000), the accuracy was on average 87% [...] resulting decision trees.
Keywords: publicistic style, text attribution, decision tree, $n$-gram, F. M. Dostoevsky, information system “Statistical methods for analyzing literary texts”, tree matching.
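
The feature extraction summarized in the abstract (counting part-of-speech bigrams and trigrams over fixed-size fragments) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes the text has already been converted to one part-of-speech tag per word, and the tag names and the helper functions `split_into_fragments`, `ngram_counts` and `fragment_features` are hypothetical. Building the decision trees from these counts is not shown.

```python
from collections import Counter
from typing import List, Sequence


def split_into_fragments(tags: Sequence[str], size: int) -> List[Sequence[str]]:
    """Cut a tag sequence into consecutive fragments of exactly `size` tags.

    An incomplete tail shorter than `size` is dropped (an assumption made
    here for simplicity).
    """
    return [tags[i:i + size] for i in range(0, len(tags) - size + 1, size)]


def ngram_counts(tags: Sequence[str], n: int) -> Counter:
    """Count every run of n consecutive part-of-speech tags."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def fragment_features(tags: Sequence[str], size: int) -> List[Counter]:
    """Return one Counter per fragment holding its bigram and trigram counts.

    Bigram keys are 2-tuples and trigram keys are 3-tuples, so the two
    kinds of features never collide inside one Counter.
    """
    features = []
    for fragment in split_into_fragments(tags, size):
        features.append(ngram_counts(fragment, 2) + ngram_counts(fragment, 3))
    return features


if __name__ == "__main__":
    # Hypothetical tag sequence; in practice it would come from a POS tagger.
    example_tags = ["NOUN", "VERB", "NOUN", "VERB", "ADJ", "NOUN"]
    for counts in fragment_features(example_tags, size=3):
        print(dict(counts))
```

With real data, `size` would be 500, 700 or 1000, and the per-fragment counts would form the feature vectors passed to a decision tree learner.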