RUS  ENG
Full version
JOURNALS // Taurida Journal of Computer Science Theory and Mathematics // Archive

Taurida Journal of Computer Science Theory and Mathematics, 2017 Issue 1, Pages 32–41 (Mi tvim12)

Kolmogorov complexity and the $VC$ dimension of families of recursive functions

V. I. Donskoy

Crimea Federal University, Simferopol

Abstract: Kolmogorov's definition of algorithmic complexity of constructive objects is applicable not only to character strings, but also to the functions, function collections, and many other objects. If we define Kolmogorov complexity of a family of recursive functions, it will be its entropy characteristics. Introduced by different ways measures of a complexity of families of functions are of great theoretical and practical interest. In particular – combinatorial dimension or the dimension of Vapnik-Chervonenkis that for an arbitrary family $\mathfrak{F}$ is denoted as $VCD(\mathfrak{F})$ is a measure of the diversity or entropy of this family.
The main idea of this article consists in the following. According to Kolmogorov's approach it must be assumed that the complexity of a family of functions $\mathfrak{F}$ must be the smallest length of a binary word, which is received as input, the optimal machine-decompressor will allow to completely restore any function of the family $\mathfrak{F}$. So determined Kolmogorov complexity of the family, denoted further $KC(\mathfrak{F})$, will be a measure of its uncertainty – Kolmogorov entropy. If $\mathfrak{F}$ is a finite collection of computable functions and $KC(\mathfrak{F}) =len(p)$, where $p$ – the binary word, $len(p)$ – the length of $p$, one can specify a constructive method of constructing upper bounds of the Kolmogorov complexity $KC(\mathfrak{F})$ consisting in the fact that on the basis of precise original description of the class $\mathfrak{F}$ it is constructed (encoded or, one might say, programmed) as short as possible description of this class in the form of word $p'$. In this way mathematicians do while programming data structures. It is necessary that the word $p'$ would be possible by using some decompressor to recover any function of the family $\mathfrak{F}$, and then $KC(\mathfrak{F})\leq len(p')$.
Previously [4] the author introduced the definition of the conditional Kolmogorov complexity $K_l(\mathfrak{F})$ of an arbitrary family of recursive functions $\mathfrak{F}$ relative of a set of training samples of the length $l$. By use this definition, the inequalities
\begin{equation} \label{OldContrA} VCD(\mathfrak{F})\leq K_l(\mathfrak{F})< VCD(\mathfrak{F})\log l, \end{equation}
was proven, just in case when both the conditions
$$ \mathrm{i}) \ \ VCD(\mathfrak{F})\geq 2;$$

\begin{equation} \label{Cond2A} \mathrm{ii}) \ \ l > VCD(\mathfrak{F}) \end{equation}
have a place. The result (\ref{OldContrA}) was useful because it gave lower and upper estimates of the conditional Kolmogorov complexity of a family of functions and let to prove the theorem on non-computability of $VC$ dimension of arbitrary family of recursive functions. However, exhaustive justification of the $pVCD$ method [6] of estimation of $VC$ dimension was prevented by the condition (\ref{Cond2A}).
In the paper, a new definition of Kolmogorov complexity $KC(\mathfrak{F})$ is presented, and the inequality $VCD(\mathfrak{F})\leq KC(\mathfrak{F})$ is proved. This inequality allows to use $len(p')$ as an estimation of the Kolmogorov complexity of the family $\mathfrak{F}$ and as the upper estimation of its capacity $VCD(\mathfrak{F})$. Estimations of $VCD$ of some families of functions obtained by $pVCD$ method (and for the comparison – obtained by other methods) are given.

Keywords: $VC$ dimension, Kolmogorov complexity, recursive functions.

UDC: 519.7

MSC: 68R01



© Steklov Math. Inst. of RAS, 2024