Abstract:
The length of the longest common subsequence (LCS) of a pair of random finite sequences over an alphabet of 4 characters is considered as a random function of the sequence lengths $m$ and $n$.
Exact probability distribution tables are presented for all pairs of lengths in the range $2<m+n<19$.
Graphs of the expected value and standard deviation as functions of the lengths are shown in linear perspective, which places the behaviour at large lengths at the horizon.
To illustrate the behaviour at large lengths, the results of numerical simulation for $m+n=32$, 512, 8192 and 131072 are also shown on the same graphs.
The graph of the expected value as a function of $m$ and $n$ appears to approach a right circular cone asymptotically.
The variance appears to grow as $(n+m)^{\frac34}$.
Key words and phrases: similarity of strings, sequence alignment, edit distance, LCS, Levenshtein metric.
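
Illustration: the quantities named in the abstract can be sketched in a few lines of Python. This is only a minimal sketch and not the method used to produce the tables or graphs: it assumes i.i.d. uniform characters, uses the textbook dynamic-programming recurrence for the LCS length, and obtains the exact distribution by brute-force enumeration, which is practical only for very small $m+n$. The names `ALPHABET`, `lcs_length`, `exact_distribution` and `simulated_mean` are hypothetical and chosen for this example.

```python
import itertools
import random
from collections import Counter
from fractions import Fraction

ALPHABET = "ACGT"  # assumption: any 4 distinct symbols would do


def lcs_length(a, b):
    """Length of the longest common subsequence (textbook DP, O(len(a)*len(b)))."""
    prev = [0] * (len(b) + 1)  # DP row for the previous prefix of a
    for x in a:
        cur = [0]  # LCS with an empty prefix of b is 0
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def exact_distribution(m, n):
    """Exact distribution of the LCS length, enumerating all 4**(m+n) pairs."""
    counts = Counter()
    for a in itertools.product(ALPHABET, repeat=m):
        for b in itertools.product(ALPHABET, repeat=n):
            counts[lcs_length(a, b)] += 1
    total = len(ALPHABET) ** (m + n)
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


def simulated_mean(m, n, trials, seed=0):
    """Monte Carlo estimate of the expected LCS length (assumes uniform i.i.d. characters)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = rng.choices(ALPHABET, k=m)
        b = rng.choices(ALPHABET, k=n)
        total += lcs_length(a, b)
    return total / trials
```

For example, `exact_distribution(2, 1)` gives probability 9/16 for length 0 and 7/16 for length 1, and `simulated_mean(16, 16, 1000)` gives a rough estimate of the expected value for one point with $m+n=32$.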