Yu. N. Orlov, S. A. Shilin, “Statistical text language recognition with the use of $n$-gram frequency”, Keldysh Institute preprints, 2017, 032, 21 pp.

Statistical text language recognition with the use of $n$-gram frequency

Yu. N. Orlov, S. A. Shilin

Abstract: Statistical properties of texts in European languages are investigated using a recognition procedure based on $n$-gram distribution patterns. A numerical algorithm is constructed for analyzing the Hurst exponent of letter distance distributions in a text fragment. The accuracy of binary recognition is estimated as 0.99.
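
As a rough illustration of what recognition by $n$-gram distribution patterns can look like, the Python sketch below compares the letter $n$-gram frequencies of a fragment against reference patterns and picks the closest one. The abstract does not specify the preprint's actual procedure, so everything here is an assumption made for illustration: the function names (`ngram_frequencies`, `l1_distance`, `recognize`), the choice of the L1 distance, the nearest-pattern decision rule, and the toy reference sentences.

```python
from collections import Counter


def ngram_frequencies(text, n=2):
    """Relative frequencies of letter n-grams; non-letters are dropped (an assumption)."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}


def l1_distance(p, q):
    """Sum of absolute differences between two frequency distributions."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def recognize(fragment, patterns, n=2):
    """Return the label of the reference pattern closest to the fragment.

    `patterns` maps a language label to an n-gram frequency dictionary
    built with the same value of n.
    """
    freq = ngram_frequencies(fragment, n)
    return min(patterns, key=lambda lang: l1_distance(freq, patterns[lang]))


if __name__ == "__main__":
    # Toy reference texts (hypothetical); real patterns need large corpora.
    references = {
        "en": "the quick brown fox jumps over the lazy dog and the cat",
        "de": "der schnelle braune fuchs springt über den faulen hund",
    }
    patterns = {lang: ngram_frequencies(txt) for lang, txt in references.items()}
    print(recognize("the dog and the fox", patterns))
```

With reference patterns estimated from large corpora, the same nearest-pattern rule applies to two candidate languages (binary recognition) or to several. The accuracy reported in the abstract refers to the authors' own procedure, not to this sketch.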

Keywords: text language recognition, $n$-gram frequency.

DOI: 10.20948/prepr-2017-32