Mat. Zametki, 2017 Volume 101, Issue 4, Pages 531–548 (Mi mzm11531)

This article is cited in 9 papers

The Relationship between the Fermi–Dirac Distribution and Statistical Distributions in Languages

V. P. Maslov

National Research University "Higher School of Economics" (HSE), Moscow

Abstract: In this article, we study, from a mathematical point of view, the analogies between language and multi-particle systems in thermodynamics. We attempt to bring an appropriate mathematical apparatus and the technical tools of statistical physics to the description of language. In particular, we apply the notions of number of degrees of freedom, Bose condensate, phase transition, and others to linguistic objects. On the basis of a statistical analysis of dictionaries and of statistical distributions in languages, we conjecture that the transition from the semiotic communication system of the higher primates to human language can be described as a phase transition of the first kind. We show that the number of words appearing with frequency 1 in a corpus of texts is equal to the number of ones in the corresponding Fermi–Dirac distribution, while the high frequency of stop-words corresponds to the large number of particles in the Bose condensate when the number of degrees of freedom is less than two, provided there is a gap in the spectrum. These considerations are illustrated by examples from the Russian language. Some of the illustrative examples are untranslatable into English and were therefore replaced in translation by similar examples from the English language.

Keywords: number of degrees of freedom, frequency of occurrence, frequency dictionary, Zipf's law, statistical language distribution, Bose–Einstein distribution, Fermi–Dirac distribution, number theory, van der Waals model, isotherm, tropical topology, tropical analysis, telegraphic style, stop-word.

UDC: 511.3+81'32

Received: 11.11.2016

DOI: 10.4213/mzm11531


English version:
Mathematical Notes, 2017, 101:4, 645–659
