Abstract:
The paper deals with coding of arbitrary finite sequences of symbols. Identical parts (consisting of two or more symbols) are to be found. Bach of them is abbreviated by means of a special symbol, the abbreviation is executed and the information about the abbreviation is added to the resulting sequence. This operation can reduce the total number of symbols. This coding procedure is formalized by means of context-free grammars with only one deducible word (ad hoc context-free grammars of [1]). Let the sequence be longer than $4n^2+3n+1$ where $n$ is the length of the alphabet the initial sequence is written down in. Then the sequence is reducible and the bound is exact. The length of any code of a sequence is proved to be not less than $3\log_2z-2$, where $z$ is the length of the sequence. This is also the exact bound.
The investigation of abbreviation techniques dealing with identical or similar groups of symbols may prove useful for the solution of problems associated with extrapolation of arbitrary sequences.